I’m working on a site for a client who did not properly sanitize some of their form data. In particular, they did nothing to account for line breaks received from textareas. So, in hundreds of db rows there are \r\n and \r\n\r\n and \r\n\r\n\r\n… you get the idea. I need to somehow extract only the rows which have \r\n so I can clean up the data. For the life of me, though, I can’t build a query which selects these rows! I realize that I must need to escape the backslash somehow, but I’m not doing so successfully.
I have tried the following:
SELECT Id,Description FROM lakes WHERE Description LIKE '%\r\n%';
SELECT Id,Description FROM lakes WHERE Description LIKE '%\\r\\n%';
SELECT Id,Description FROM lakes WHERE Description LIKE '%\\\r\\\n%';
SELECT Id,Description FROM lakes WHERE Description LIKE '%\\\\r\\\\n%';
I can of course continue to add backslashes, but I feel like I’m probably missing something important if 4 backslashes aren’t doing the trick.
Each of these queries returns results, but none of the results are ever the ones with the \r\n in them. The last query (with 4 backslashes) did return 2 rows which did have the \r\n, along with 138 additional rows which didn’t.
Also, because I know someone will mention it, I do know about nl2br(). For some reason it is not working with the raw data from the table. For now, I would be happy being able to pull these rows and clean them up.
Thanks in advance!
Josh
This is an example of something that SQL really isn’t good at.
Using PHP, write a simple script that fetches all the rows (or 20 rows at a time) and use a PHP regular expression to find the ones you are after. Then do whatever you want with the data and update the database. SQL isn’t sophisticated enough to do what you want; it’s much easier to do in PHP.
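A rough sketch of that idea, leaving the database calls out. It assumes the rows have already been fetched as arrays with Id and Description keys (the column names from the queries above). The function names and the pattern are made up for illustration. Since it isn't clear from the post whether the table holds real carriage-return/line-feed characters or the literal text backslash-r backslash-n, the pattern matches both:

```php
<?php
// Hypothetical pattern: matches the literal text "\r\n", a real CR+LF pair,
// a lone literal "\r" or "\n", or a lone real CR or LF character.
// In a single-quoted PHP string, '\\\\r' reaches the regex engine as \\r,
// which means "a backslash followed by the letter r".
const LINE_BREAK_PATTERN = '/\\\\r\\\\n|\r\n|\\\\[rn]|[\r\n]/';

// True if the text contains any of the line-break forms above.
function containsLineBreak(string $text): bool
{
    return preg_match(LINE_BREAK_PATTERN, $text) === 1;
}

// Replace every matched form with a single real newline ("\n").
// Consecutive breaks are kept, so paragraph gaps survive.
function normalizeLineBreaks(string $text): string
{
    return preg_replace(LINE_BREAK_PATTERN, "\n", $text);
}

// Keep only the rows whose Description contains a line break.
// $rows is assumed to look like [['Id' => 1, 'Description' => '...'], ...].
function findAffectedRows(array $rows): array
{
    return array_values(array_filter($rows, function (array $row): bool {
        return containsLineBreak($row['Description']);
    }));
}
```

Each row returned by findAffectedRows() could then be passed through normalizeLineBreaks() and written back with an UPDATE. Note that the literal-text branch will also catch legitimate text such as a Windows path containing \n, so it is worth eyeballing the matches before updating anything.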