I have a text file that changes regularly, and I need to scrape it to display on screen and potentially insert into a database. The text is formatted as follows:
"Stranglehold"
Written by Ted Nugent
Performed by Ted Nugent
Courtesy of Epic Records
By Arrangement with
Sony Music Licensing
"Chateau Lafltte '59 Boogie"
Written by David Peverett
and Rod Price
Performed by Foghat
Courtesy of Rhino Entertainment
Company and Bearsville Records
By Arrangement with
Warner Special Products
I only need the song title (the text between the quotes), who it was written by, and who it was performed by. As you can see, the "Written by" entry can span more than one line.
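To make the goal concrete, this is roughly the record I would like to end up with for the second song. The key names (`title`, `written_by`, `performed_by`) are placeholders made up for illustration; any naming would do.

```php
<?php
// Hypothetical target record for the second sample entry.
// The key names are illustrative assumptions, not an existing schema.
$wanted = [
    'title'        => "Chateau Lafltte '59 Boogie",
    'written_by'   => 'David Peverett and Rod Price', // two source lines joined into one value
    'performed_by' => 'Foghat',
];
```

The "Courtesy of" and "By Arrangement with" lines would simply be dropped.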
I’ve searched through the questions, and this one is similar: "Scraping a plain text file with no HTML?". I was able to modify the solution (https://stackoverflow.com/a/8432563/827449) below so that it at least finds the text between the quotes and puts it in the array. However, I can’t figure out where and how to put the next preg_match statements for "Written by" and "Performed by" so that they are added to the array with the correct information, assuming I have the right regex, of course. Here is the modified code:
<?php
$in_name = 'in.txt';
$in = fopen($in_name, 'r') or die();

function dump_record($r) {
    print_r($r);
}

$current = array();
/* Read from the handle opened above ($in); compare against false so a
 * line containing only "0" does not end the loop early */
while (($line = fgets($in)) !== false) {
    /* Skip empty lines (any amount of whitespace counts as 'empty') */
    if (preg_match('/^\s*$/', $line)) continue;
    /* Search for 'things between quotes' stanzas */
    if (preg_match('/(?<=\")(.*?)(?=\")/', $line, $start)) {
        /* If we already parsed a record, this is the time to dump it */
        if (!empty($current)) dump_record($current);
        /* Let's start the new record */
        $current = array( 'id' => $start[1] );
    }
    else if (preg_match('/^(.*):\s+(.*)\s*/', $line, $keyval)) {
        /* Otherwise parse a plain 'key: value' stanza */
        $current[ $keyval[1] ] = $keyval[2];
    }
    else {
        error_log("parsing error: '$line'");
    }
}
/* Don't forget to dump the last parsed record, a situation
 * we only detect at EOF (end of file) */
if (!empty($current)) dump_record($current);
fclose($in);
Any help would be great, as I am now in over my head with my limited PHP and regex knowledge.
How about:
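One possible approach, offered as a minimal sketch rather than a definitive solution: walk the lines once and remember which field is currently being filled, so that a continuation line such as "and Rod Price" is appended to the right value. The function name `parse_credits` and the keys `title`, `written_by`, and `performed_by` are assumptions made for this example. It also assumes PHP 7 or later, that each title sits alone on its line in straight double quotes, and that the only sections to skip begin with "Courtesy of" or "By Arrangement with".

```php
<?php
// Sketch under the assumptions stated above: parse_credits and the
// array keys are hypothetical names chosen for this example.
function parse_credits(string $text): array
{
    $records = [];
    $current = null; // record being built, or null before the first title
    $field   = null; // field that continuation lines extend, or null to ignore them

    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }

        if (preg_match('/^"(.+)"$/', $line, $m)) {
            // A fully quoted line starts a new song; store the previous one first.
            if ($current !== null) {
                $records[] = $current;
            }
            $current = ['title' => $m[1], 'written_by' => '', 'performed_by' => ''];
            $field   = null;
        } elseif ($current === null) {
            continue; // ignore anything before the first title
        } elseif (preg_match('/^Written by\s+(.+)$/i', $line, $m)) {
            $current['written_by'] = $m[1];
            $field = 'written_by';
        } elseif (preg_match('/^Performed by\s+(.+)$/i', $line, $m)) {
            $current['performed_by'] = $m[1];
            $field = 'performed_by';
        } elseif (preg_match('/^(Courtesy of|By Arrangement with)\b/i', $line)) {
            $field = null; // unwanted section: drop this line and its continuations
        } elseif ($field !== null) {
            // Continuation of a wrapped "Written by" / "Performed by" value.
            $current[$field] .= ' ' . $line;
        }
    }

    if ($current !== null) {
        $records[] = $current; // the last record is only complete at end of input
    }
    return $records;
}
```

With the file from the question, this could be called as `print_r(parse_credits(file_get_contents('in.txt')));`. Tracking the current field is what lets the multi-line "Written by" value be joined without a separate regex for every possible continuation line.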
output: