I’ve got a CSV file with some 600 records where I need to replace some [CRLF] with a [space] but only when the [CRLF] is positioned between two [“] (quotation marks). When the second [“] is encountered then it should skip the rest of the line and go to the next line in the text.
I don’t really have a starting point. Hope someone comes up with a suggestion.
Example:
John und Carol,,Smith,,,J.S.,,,,,,,,,,,,,+11 22 333 4444,,,,,"streetx 21[CRLF]
New York City[CRLF]
USA",streetx 21,,,,New York City,,,USA,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Normal,,My Contacts,[CRLF]
In this case the two [CRLF] after the first [“] need to be replaced with a space [ ]. When the second [“] is encountered, skip the end of the line and go to next line.
Then again, now on the next line, after the first [“] is encountered replace all [CRLF] until the second [“] is encountered. The [CRLF]s vary in numbers.
In the CSV file, the number of commas [,] before (23) and after (65) the two quotation marks [“] is constant.
So maybe a comma counter could be used. I don’t know.
Thanks for feedback.
This will work using one regex only (tested in Notepad++):
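Enter this regex in the Find what field:
((?:^|\r\n)[^"]*+"[^\r\n"]*+)\r\n([^"]*+")
Enter this string in the Replace with field:
$1 $2
Make sure the Wrap around check box and the Regular expression radio button are selected.
Do a Replace All as many times as required (until the “0 occurrences were replaced” dialog pops up).
Explanation:
Group 1, ((?:^|\r\n)[^"]*+"[^\r\n"]*+), matches from the start of a line through the opening quote and then everything up to the first line break inside the quotes.
The \r\n that follows is the line break to be replaced.
Group 2, ([^"]*+"), matches the rest of the quoted text up to and including the closing quote.
The replacement puts groups 1 and 2 back with a space between them. Each pass removes only one line break per quoted field, which is why Replace All has to be repeated.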
Note: *+ is a possessive quantifier. It never gives back what it has already matched, which avoids needless backtracking and speeds up execution.
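If you would rather script this than click Replace All repeatedly, the same loop can be sketched in Python. This is only a rough sketch under some assumptions: the whole file is held in memory as one string, line endings are CRLF, and each record has exactly one quoted field. The function name replace_until_stable and the sample data are made up for illustration. The possessive *+ is written as plain * here because Python's re module only accepts possessive quantifiers from version 3.11 on; for this particular pattern the matches should come out the same.

```python
import re

# Same pattern as in the Find what field, with greedy * instead of possessive *+.
PATTERN = re.compile(r'((?:^|\r\n)[^"]*"[^\r\n"]*)\r\n([^"]*")')

def replace_until_stable(text):
    """Repeat the substitution until a pass replaces nothing.

    Returns the fixed text and the number of passes that changed something.
    """
    passes = 0
    while True:
        # Each pass joins at most one line break per quoted field.
        text, count = PATTERN.subn(r'\1 \2', text)
        if count == 0:
            return text, passes
        passes += 1

# Hypothetical sample: two records, the first with two CRLFs inside the quotes.
sample = 'a,,"streetx 21\r\nNew York City\r\nUSA",b,\r\nc,,"x\r\ny",d,\r\n'
fixed, passes = replace_until_stable(sample)
print(repr(fixed))
print(passes)
```

When reading the real file, open it with newline='' so that Python does not translate the CRLFs before the regex sees them.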
Update:
This more general version of the regex will work with any line break sequence (\r\n, \r or \n):
((?:^|[\r\n]+)[^"]*+"[^\r\n"]*+)[\r\n]+([^"]*+")
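Outside Notepad++, a single pass is also possible. The sketch below is a different technique from the regex above, not a translation of it: it finds each quoted span and collapses any run of line breaks inside it to one space. It assumes the quotation marks in the file are balanced; the function name join_breaks_in_quotes is hypothetical.

```python
import re

def join_breaks_in_quotes(text):
    """Replace every run of \\r and \\n inside double quotes with one space."""
    def _join(match):
        # match.group(0) is one quoted span, including both quotation marks.
        return re.sub(r'[\r\n]+', ' ', match.group(0))
    # Line breaks outside quoted spans are never matched, so record ends are kept.
    return re.sub(r'"[^"]*"', _join, text)

# Hypothetical sample mixing \n, \r\n and \r inside the quoted fields.
mixed = 'a,"x\ny\r\nz",b\nc,"p\rq",d\n'
print(repr(join_breaks_in_quotes(mixed)))
```

Because the work is done per quoted span, no repetition is needed and the number of line breaks inside the quotes does not matter.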