I have a txt file that is basically in address form, like so:
John Smith
123 Address Way
Blah Blah Blah
Each block of text is followed by 3 blank lines (which I want). However, some of the addresses in the file are missing data, thus they are blank like so:
John Smith
123 Address Way
Blah Blah Blah
I want to keep the three blank lines after each block, but delete the single blank lines that appear inside a block.
Anybody have any ideas? All the stuff I've found on Google relates to deleting multiple blank lines, or all blank lines… the opposite of what I need.
When you have one of these problems, and the file is not gigantic, one of the best tools for the job is perl in undef $/ mode, which makes it read the entire file as one big string; this allows you to match \n just like any other character.
At the character level, assuming there is no trailing horizontal whitespace on any line, a blank line is two newline characters in a row; two blank lines is three newline characters, and so on. To delete a blank line, you delete one of the two newline characters.
Now, if you just write s/\n\n/\n/g, that will do more than you want, because \n\n will match pairs of newlines within longer runs of newlines. So you need a construct that will match two newlines in a row, but only if they are not preceded or followed by more newlines. This is what look-around assertions are for.
A pair of substitutions, the first stripping trailing horizontal whitespace and the second collapsing isolated \n\n pairs guarded by a negative lookbehind and a negative lookahead, should do the job. It will have the side effect of deleting trailing whitespace, if any, from every line of the file.
If you want to delete double blank lines as well as single blank lines (but still not triple blank lines), you just have to adjust the middle of the second RE so that it matches either two or three newlines.