I have a set of 10 CSV files, which normally contain entries of this kind:
a,b,c,d
d,e,f,g
Due to some error, entries in these files now look like this:
a,b,c,d
d,e,f,g
,,,
h,i,j,k
I want to remove the lines that contain only commas from all the files. The files are on a Linux filesystem.
Is there a command you would recommend that can replace the erroneous lines in all the files?
It depends on what you mean by replace. If you mean ‘remove’, then a trivial variant on @wnoise’s solution is:
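A minimal sketch of such a command, assuming `sed` with in-place editing (`-i`) is available and that the files match `*.csv`. The scratch directory and `sample.csv` are made up for illustration, so that nothing real is modified; this is not necessarily the exact command referred to above.

```shell
# Build a throwaway sample so the command can be tried safely.
demo_dir=$(mktemp -d)
printf 'a,b,c,d\nd,e,f,g\n,,,\nh,i,j,k\n' > "$demo_dir/sample.csv"

# Delete every line that consists of exactly three commas.
# -i.bak edits each file in place and keeps the original as <name>.bak.
sed -i.bak '/^,,,$/d' "$demo_dir"/*.csv

# Show the cleaned file.
cat "$demo_dir/sample.csv"
```

On the real data, the same `sed` line would be run from the directory holding the 10 files, with `*.csv` in place of `"$demo_dir"/*.csv`. The `.bak` copies can be deleted once the result has been checked.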
Note that this deletes only those lines consisting of exactly three commas. If you want to delete malformed lines with any number of commas (including zero, which means empty lines) and no other characters on the line, then:
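A sketch of that broader variant, under the same assumptions as before (`sed` with `-i`, files matching `*.csv`). The sample file and scratch directory are again hypothetical.

```shell
# Throwaway sample with several kinds of comma-only lines, plus an empty line.
scratch=$(mktemp -d)
printf 'a,b,c,d\n,,,\n\n,\n,,,,,\nh,i,j,k\n' > "$scratch/sample.csv"

# ',*' matches zero or more commas, so this removes empty lines
# and lines made up of nothing but commas.
sed -i.bak '/^,*$/d' "$scratch"/*.csv

# Show the cleaned file.
cat "$scratch/sample.csv"
```

Neither pattern matches if the files have Windows line endings, because the trailing carriage return counts as another character on the line; in that case the pattern would need to allow for it.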
There are endless other variations on the regex that would deal with other scenarios. Handling full CSV data, with commas inside quoted fields, starts to need something more than a plain regex engine. It can be done, within broad limits, especially in richer regex systems such as PCRE or Perl, but it requires more work.
Check out Mastering Regular Expressions.