Assuming a tab-separated values (TSV) file with a header line, how would one create a PHP array with the header fields as the keys and the data fields as the values?
Assuming $txtArray contains all the lines in the file:
$hdrArray = explode("\t", $txtArray[0]);
$i = 0;
foreach ($hdrArray as $hdr) {
    $heads[$hdr] = '';
    $headerNames[$i++] = $hdr;
}
for ($i = 1; $i < (count($txtArray) - 1); $i++) {
    $datArray = explode("\t", $txtArray[$i]);
    if (count($datArray) > 1) {
        for ($j = 0; $j < count($datArray); $j++) {
            $heads[$headerNames[$j]] = $datArray[$j];
        }
    }
    # process the line
}
I’ve got $heads containing field_name => field_data for all the fields in each line of the file. Is there a better way to code this?
What qualifies as ‘better’?
You could use a regex split to make it a little more robust, but if you have control over the source file you shouldn’t have to worry about dirty data.
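As a rough illustration of the regex split idea, here is a minimal sketch that tolerates stray spaces around the tab delimiters. The helper name splitTsvLine is hypothetical, and the pattern is only one possible choice; it deliberately matches spaces rather than \s so that consecutive tabs (empty fields) are not collapsed.

```php
<?php
// Hypothetical helper: split one TSV line, tolerating spaces around the tabs.
function splitTsvLine(string $line): array
{
    // Strip the trailing newline (and a carriage return, if the file uses CRLF).
    $line = rtrim($line, "\r\n");
    // Split on a tab plus any spaces on either side; empty fields are preserved.
    return preg_split('/ *\t */', $line);
}

$fields = splitTsvLine("name \t age\t\tcity\r\n");
// $fields is ['name', 'age', '', 'city']
```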
One obvious optimization I see is to cache the count() result.
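Compute count($txtArray) - 1 once before the loop and compare $i against the saved value, instead of calling count() inside the loop condition.
As written, count() is called again on every iteration. PHP arrays keep track of their own size, so each call is cheap, but there is still no reason to repeat it: do the calculation once and save the result.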
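A minimal sketch of caching the count, using made-up sample lines. The variable name $lastLine is an assumption for illustration, and the bound keeps the original count() - 1, which presumes the last element of $txtArray is an empty trailing line.

```php
<?php
// Sample data standing in for the file's lines (illustrative values only).
$txtArray = ["id\tname", "1\tAnn", "2\tBob", ""];

// Call count() once; $lastLine is a hypothetical name for the saved bound.
$lastLine = count($txtArray) - 1;

$rows = [];
for ($i = 1; $i < $lastLine; $i++) {
    $rows[] = explode("\t", $txtArray[$i]);
}
// $rows now holds the two data lines, each split into fields.
```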
I don’t see why you need the if (count($datArray) > 1) check.
If you’re working with ‘clean’ data, every row should have a fixed number of values, so counting them and checking for an empty row is unnecessary. To speed things up you could cache the row length by counting the number of fields in the header.
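After the line that builds $hdrArray, store count($hdrArray) in a variable, then use that variable as the upper bound of the second for loop in place of count($datArray).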
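A minimal sketch of caching the header length, assuming every data row has the same number of fields as the header. The name $fieldCount and the sample values are illustrative only.

```php
<?php
$txtArray = ["id\tname\tcity", "1\tAnn\tOslo"];

$hdrArray = explode("\t", $txtArray[0]);
// Count the header fields once; $fieldCount is a hypothetical name.
$fieldCount = count($hdrArray);

$datArray = explode("\t", $txtArray[1]);
$heads = [];
for ($j = 0; $j < $fieldCount; $j++) {
    // Assumes every data row has exactly $fieldCount fields.
    $heads[$hdrArray[$j]] = $datArray[$j];
}
// $heads is ['id' => '1', 'name' => 'Ann', 'city' => 'Oslo']
```

Since $hdrArray is already indexed from 0, it can be used directly for the key lookup, so a separate $headerNames array is not needed in this sketch.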
If you do have to worry about empty rows, it would probably be faster to check for an empty line at the top of the loop and skip it with continue.
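One way the empty-line check might look, as a sketch with made-up sample lines. Note that trim() also strips tabs, so a line consisting only of tabs (all fields empty) would be skipped too; whether that is wanted depends on the data.

```php
<?php
$txtArray = ["id\tname", "1\tAnn", "", "2\tBob"];
$lineCount = count($txtArray);

$kept = [];
for ($i = 1; $i < $lineCount; $i++) {
    // trim() also catches lines that hold only a newline or spaces.
    if (trim($txtArray[$i]) === '') {
        continue;
    }
    $kept[] = $txtArray[$i];
}
// $kept holds the two non-empty data lines.
```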
Altogether you get:
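A minimal sketch of how these pieces might fit together, assuming a non-empty $txtArray whose first element is the header. The wrapper function parseTsvLines and the names $fieldCount and $lineCount are illustrative, not from the original post; stripping line endings with rtrim() and padding short rows with '' are also assumptions added here.

```php
<?php
// Hypothetical wrapper around the loop; returns one associative array per data line.
function parseTsvLines(array $txtArray): array
{
    $hdrArray   = explode("\t", rtrim($txtArray[0], "\r\n"));
    $fieldCount = count($hdrArray);   // counted once, reused for every row
    $lineCount  = count($txtArray);   // counted once, reused as the loop bound

    $rows = [];
    for ($i = 1; $i < $lineCount; $i++) {
        $line = rtrim($txtArray[$i], "\r\n");
        if ($line === '') {
            continue; // skip blank lines, including a trailing one
        }
        $datArray = explode("\t", $line);
        $heads = [];
        for ($j = 0; $j < $fieldCount; $j++) {
            // Assumes clean data; a short row gets '' for its missing fields.
            $heads[$hdrArray[$j]] = $datArray[$j] ?? '';
        }
        $rows[] = $heads; // "process the line" would go here
    }
    return $rows;
}

$rows = parseTsvLines(["id\tname\n", "1\tAnn\n", "\n", "2\tBob\n"]);
// $rows[0] is ['id' => '1', 'name' => 'Ann']; $rows[1] is ['id' => '2', 'name' => 'Bob']
```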
I’m assuming your first implementation worked and that the source data is well-formed (i.e., every row has the same number of columns).
All I did was apply some simple (and common) optimizations to cut down on the number of unnecessary calculations. Pretty basic stuff that you get used to seeing after a while.