I have been looking for a way to reformat a pipe-separated CSV file using a few conditional rules. I'm pretty sure this can be done in PHP (strpos and if statements) or with XSLT, but I wanted to know whether that is the best/easiest way to do it before I go and learn my way around a new language. Here is a small example of the kind of thing I'm trying to achieve (the real file is about 25,000 lines, in case that changes the answer):
99407350|Math Book #13 (Random Information)|AB Collings|http:www.abc.com/ABC
497790366|English Book|Harold Herbert|http:www.abc.com/HH
Transform to this:
99407350|Math Book|#13|AB Collings|http:www.abc.com/ABC
497790366|English Book||Harold Herbert|http:www.abc.com/HH
Any advice about which direction I need to look in would be great.
PHP provides str_getcsv() (PHP 5.3 and later) and fgetcsv() (PHP 4 and 5) for this, and both accept a custom delimiter such as |, so if you are working in a PHP environment, use one of those. See e.g. http://www.php.net/manual/en/function.fgetcsv.php
If you parse the file yourself, remember to cope with "…|…" and/or \| being used to put a | inside a field. Or test to make sure it can't happen, e.g. check the code that exports the database to CSV, if that's what's happening.
Note also – on Unix / Solaris / Linux / OS X systems,
awk -F '|' '(NF != 9)' yourfile.csv | wc
will count the lines that have other than 9 fields (substitute the number of fields you expect; the sample above has 4). If you are certain | never occurs except as a field delimiter, awk is a perfectly fine language for this too, e.g. with
awk -F '|' -v OFS='|' '{ gsub(/ [(].*[)]/, "", $2); print }' yourfile.csv
Here, [(] matches a literal ( in a way that works across different versions of awk, and likewise [)] for ).
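For the exact transformation in the question (dropping the trailing parenthetical and moving a trailing #number into its own column), a minimal awk sketch might look like the following. It assumes | never appears inside a field and that the parenthetical and the #number only ever appear at the end of the second field. The function name reformat_books is just a placeholder. Setting OFS keeps | as the separator when awk rebuilds the line after a field is changed.

```shell
# Hypothetical helper: reads pipe-separated lines from the given files
# (or from stdin when no file is given) and writes the reformatted lines.
reformat_books() {
    awk -F '|' -v OFS='|' '
    {
        title = $2

        # Drop a trailing parenthetical such as " (Random Information)".
        sub(/ *[(].*[)] *$/, "", title)

        # Move a trailing "#<digits>" into its own column, if present.
        num = ""
        if (match(title, / [#][0-9]+$/)) {
            num   = substr(title, RSTART + 1)     # e.g. "#13"
            title = substr(title, 1, RSTART - 1)  # e.g. "Math Book"
        }

        # Assigning to $2 makes awk rebuild the line using OFS, so the
        # embedded separator becomes a new (possibly empty) column.
        $2 = title OFS num
        print
    }' "$@"
}
```

Called as reformat_books yourfile.csv > reformatted.csv, this should turn the two sample lines from the question into the two target lines, with an empty third column when there is no #number. At about 25,000 lines the file size should not be a concern for awk.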