I am trying to write a shell/Perl command that will give me the row numbers of the rows that have fewer than a certain number of fields.
E.g. I have a comma-delimited text file, and I am trying to find the rows that have fewer than, say, 15 fields. So I guess the problem essentially boils down to returning the rows that have fewer than 14 commas.
Can anyone help me with that?
Thanks!
You can do this easily in bash by calling awk. This sort of script is exactly what awk was designed to do.
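A minimal sketch of such a command, wrapped in a small shell function so the threshold is easy to change. The function name short_rows and its arguments are just illustrative assumptions, not anything standard:

```shell
# short_rows is a hypothetical helper name.
# Usage: short_rows FILE [MIN_FIELDS]
# Prints the line number (NR) of every row that has fewer than
# MIN_FIELDS comma-separated fields (default: 15).
short_rows() {
    awk -F, -v min="${2:-15}" 'NF < min { print NR }' "$1"
}
```

For a one-off check you could skip the function and run something like awk -F, 'NF < 15 { print NR }' yourfile.csv directly, where yourfile.csv stands for your own file.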
The -F, option tells awk to split each line on the comma character, and NF (number of fields) holds how many fields each line was split into. Change the 15 value as needed to help you validate your files.

Don't forget that CSV files may have commas embedded inside a field if the field is surrounded by quotes.
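To illustrate the embedded-comma issue, here is a small sketch using a made-up sample row (the data is invented purely for the demonstration):

```shell
# A made-up sample row: three logical fields, the second one is quoted
# and contains a comma.
row='1,"Smith, John",42'

# Plain comma splitting ignores the quotes, so the quoted field is
# counted as two fields.
naive_count=$(printf '%s\n' "$row" | awk -F, '{ print NF }')
echo "$naive_count"   # prints 4, not 3
```

So a row like this would be over-counted by a simple -F, split, and a short row could slip through the check unnoticed.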
Solving that problem is significantly harder. Use a tab character to separate your fields (or some other character you can be sure will never appear in your data), and then sleep easy at night 😉
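If you do switch to tabs, the same check might look like the sketch below. Again, short_rows_tsv is only an illustrative name, and it assumes that no field ever contains a tab:

```shell
# short_rows_tsv is a hypothetical helper name.
# Usage: short_rows_tsv FILE [MIN_FIELDS]
# Same idea as the comma version, but splits on tab characters, so
# commas inside a field no longer affect the count.
short_rows_tsv() {
    awk -F'\t' -v min="${2:-15}" 'NF < min { print NR }' "$1"
}
```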
I hope this helps.