I have a stream of data coming from different feeds that I need to clean up.
The data is in a specific format: if a sentence spans multiple lines, each line break is preceded by a “\” (backslash), which I want to remove. A \ is also present in other parts of the text, for escaping quotes etc., and I don’t want to remove those backslashes. So in the end I only want to remove “\\n”, that is, a backslash immediately followed by a newline.
I tried the following regex to remove the \ and the \n, but it didn’t work:
singleLine.replaceAll("(\\\\n|\\\\r)", "");
I am not sure what regex would work in this case.
Regex isn’t really necessary for this; if I were you, I would use…
Many people think the replace method replaces only the first occurrence, but it replaces every occurrence. The only difference is that replaceAll treats its first argument as a regex, while replace simply matches the String literally.
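As a rough illustration of that point (not necessarily the exact call meant above), here is a minimal sketch using String.replace. The class and method names are made up for the example, and it assumes the continuation marker is a backslash immediately followed by a line feed, possibly with a carriage return in between.

```java
public class JoinContinuedLines {

    // Hypothetical helper: removes each backslash that sits directly before a line break.
    // Assumption: line breaks are either CR LF or a bare LF.
    static String joinContinuedLines(String text) {
        // "\\\r\n" is three characters: backslash, carriage return, line feed.
        // "\\\n" is two characters: backslash, line feed.
        // replace() matches these literally and replaces every occurrence.
        return text.replace("\\\r\n", "").replace("\\\n", "");
    }

    public static void main(String[] args) {
        // Two continued lines plus an escaped quote that must survive.
        String raw = "first part \\\nsecond part \\\nwith an escaped \\\"quote\\\"";

        // Strings are immutable, so the result has to be assigned.
        String cleaned = joinContinuedLines(raw);
        System.out.println(cleaned);
    }
}
```

The backslashes in front of the quotes are left alone, because they are not followed by a line break.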
If you do want to use regex though, I believe you have to write \\\\ (you have to escape the escape character once for Java and once again for regex, so x4, not just x2)
Explaining this some more
The only other issue is that, in your example, you never assign the result to anything. Strings are immutable, so replaceAll returns a new String and leaves singleLine unchanged; I’m not sure if you hid that or missed it.
Edit:
To explain the reasoning for \\\\ some more: Java requires that you write “\\” to represent one \. Regex also uses the \ character, as its own escape character, and requires the same doubling again. If you write just “\\” in Java, the regex parser essentially receives “\”, its escape character for certain things. You need to give the regex parser two of them to escape it, so in Java you need to write “\\\\” just to match a single “\”.
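To make the two layers of escaping concrete, here is a small sketch. The class and method names are hypothetical, and the use of \R (the "any line break" matcher, available since Java 8) is my own choice for the example rather than something from the question.

```java
public class BackslashRegexDemo {

    // Java source "\\\\" -> the two-character string \\ -> a regex matching ONE literal backslash.
    static String swapBackslashes(String text) {
        return text.replaceAll("\\\\", "/");
    }

    // Java source "\\\\\\R" -> the string \\\R -> regex: one literal backslash, then any line break.
    // Assumption: requires Java 8 or later for \R.
    static String joinContinuedLines(String text) {
        return text.replaceAll("\\\\\\R", "");
    }

    public static void main(String[] args) {
        // "a\\b" is the three-character string a\b.
        System.out.println(swapBackslashes("a\\b"));

        // A backslash before a line break is removed; the one before the quote is kept.
        String raw = "one \\\r\ntwo \\\nthree \\\"quoted\\\"";
        System.out.println(joinContinuedLines(raw));
    }
}
```

Only backslashes directly followed by a line break match the second pattern, so the ones used for escaping quotes stay in place.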