I’m working on huge text files (from 100 MB to 1 GB) and have to parse them to extract some particular data. The annoying thing is that the files do not have a clearly defined separator.
For example:
"element" 123124 16758 "12.4" "element" "element with white spaces inside" "element"
I have to delete the white spaces inside strings delimited by " (double quotes). The problem is that I must not erase the white spaces outside the quotes (otherwise some numbers would merge).
I can’t find a decent sed solution. Can someone help me with this?
Use awk, not sed. And there’s certainly no need to write your own C program, as
awk is already an excellent C program for file processing, even on GB-sized files. So here’s a one-liner to do the job.
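A minimal sketch of such a one-liner, assuming every double quote is balanced and never escaped inside a string. With " set as the field separator, the even-numbered fields are exactly the quoted strings, so only those have their whitespace removed. The function name strip_quoted_spaces is hypothetical and only wraps the awk call for convenience.

```shell
# Hypothetical helper: removes spaces and tabs inside double-quoted strings only.
# Assumes quotes are balanced and never escaped within a string.
strip_quoted_spaces() {
  awk 'BEGIN { FS = OFS = "\"" }   # split and rejoin each line on the double quote
       {
         # Fields 2, 4, 6, ... lie between an opening and a closing quote.
         for (i = 2; i <= NF; i += 2)
           gsub(/[ \t]+/, "", $i)
         print
       }' "$@"
}
```

It can be run as `strip_quoted_spaces huge.txt > cleaned.txt` (file names are placeholders), or fed from a pipe. awk processes the input line by line, so memory use should not grow with file size.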