I have the following data in a plain text file:
1. Value
Location : Value
Owner: Value
Architect: Value
2. Value
Location : Value
Owner: Value
Architect: Value
... up to 200+ ...
The numbering and the word Value change for each segment.
Now I need to insert this data into a MySQL database.
Do you have a suggestion on how I can traverse and scrape it so I can get the value of the text beside the number, and the values of “location”, “owner”, and “architect”?
It seems hard to do with a DOM scraping class since there are no HTML tags present.
That will work with a very simple stateful, line-oriented parser. For every line, you accumulate the parsed data into an array(). When something tells you that you’re on a new record (here, a line starting with a number and a dot), you dump what you have parsed so far and start again.
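To make the idea concrete, here is a minimal sketch of such a stateful line-oriented parser, written in Python purely for illustration; the same structure carries over to any language. The names `parse_records`, `RECORD_START` and `FIELD`, as well as the sample data, are made up for this example. It assumes that a record always starts with a line of the form "number, dot, text" and that the only fields are Location, Owner and Architect.

```python
import re
from typing import Iterable, Iterator

# Assumed record header: a number, a dot, then the title, e.g. "12. Some title".
RECORD_START = re.compile(r"^\s*(\d+)\.\s*(.*)$")
# Assumed field lines: "Owner: Someone" or "Location : Somewhere".
FIELD = re.compile(r"^\s*(Location|Owner|Architect)\s*:\s*(.*)$", re.IGNORECASE)


def parse_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per record; only the current record is held in memory."""
    record = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        start = RECORD_START.match(line)
        if start:
            # A new header means the previous record is complete: emit it.
            if record is not None:
                yield record
            record = {"number": int(start.group(1)),
                      "title": start.group(2).strip()}
            continue
        field = FIELD.match(line)
        if field and record is not None:
            record[field.group(1).lower()] = field.group(2).strip()
        # Any other line is silently ignored in this sketch.
    # Emit the last record, which has no following header to trigger it.
    if record is not None:
        yield record


# Made-up sample data in the format described in the question.
sample = """\
1. City Hall
Location : Springfield
Owner: Town Council
Architect: J. Doe
2. Old Library
Location : Shelbyville
Owner: Library Trust
Architect: A. Smith
"""
records = list(parse_records(sample.splitlines()))
```

With a real file you would pass the open file object instead of `sample.splitlines()` and handle each record as it is yielded, rather than collecting them in a list, so that memory use stays constant.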
Line-oriented parsers have a great property: they require little memory and, most importantly, constant memory. They can process gigabytes of data without breaking a sweat. I manage a bunch of production servers, and there’s nothing worse than scripts that slurp whole files into memory (then stuff arrays with parsed content, which requires more than twice the original file size in memory).
This works and is mostly unbreakable:
Obviously, you’ll need something suited to your taste in the function dump_record, like printing a correctly formatted INSERT SQL statement.
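As one possible shape for dump_record, here is a hedged sketch in Python (used only for illustration). Instead of pasting values into the SQL text, it builds the statement with `%s` placeholders and returns the values separately, which is the form common Python MySQL drivers accept in `cursor.execute(sql, params)`, so quoting is left to the driver. The table name `buildings`, the column names and the helper `build_insert` are assumptions, not part of the original question.

```python
from typing import Mapping, Tuple

# Hypothetical table and column names; adjust them to the real schema.
TABLE = "buildings"
COLUMNS = ("number", "title", "location", "owner", "architect")


def build_insert(record: Mapping[str, object]) -> Tuple[str, tuple]:
    """Return an INSERT statement with %s placeholders and its parameters."""
    placeholders = ", ".join(["%s"] * len(COLUMNS))
    sql = f"INSERT INTO {TABLE} ({', '.join(COLUMNS)}) VALUES ({placeholders})"
    # Fields missing from the record become None, which drivers send as NULL.
    params = tuple(record.get(col) for col in COLUMNS)
    return sql, params


def dump_record(record: Mapping[str, object]) -> None:
    """Print the statement and parameters; swap in cursor.execute() as needed."""
    sql, params = build_insert(record)
    print(sql, params)
```

Keeping the values out of the SQL string means a title such as "O'Hare Tower" needs no manual escaping.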