I’m working on a WordPress database, and I need some help cleaning up the post_content field.
There are about 5,000 posts that contain something like this:
RANDOM JUNK<img src="http://domain.tld/randomString.jpg" />MORE RANDOM JUNK
or
RANDOM JUNK<img src="http://domain.tld/randomString.png" />MORE RANDOM JUNK
or
RANDOM JUNK<img src="https://domain.tld/randomString.jpg" />MORE RANDOM JUNK
or
RANDOM JUNK<img src="https://domain.tld/randomString.png" />MORE RANDOM JUNK
I need to delete everything except the image, and there might be other HTML tags in some of the fields.
Where should a SQL beginner start?
This is not possible with SQL alone, at least on MySQL versions before 8.0. There, the REGEXP operator can only tell you that a pattern matches; it cannot capture part of the match and write it back.
You need to select the posts first, match the image addresses in a script, then write the result back…
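The matching step could look something like the Python sketch below. It is only an illustration: the function name `keep_only_image` and the pattern are my own assumptions, based on the four sample strings in the question (double-quoted `src`, `http` or `https`, `.jpg` or `.png`). Reading the rows and writing them back is left out, since that depends on how you connect to the database.

```python
import re

# Assumed pattern: an <img> tag whose src is an http(s) URL ending in
# .jpg or .png, as in the samples from the question.
IMG_RE = re.compile(
    r'<img\s[^>]*?src="https?://[^"]+\.(?:jpg|png)"[^>]*>',
    re.IGNORECASE,
)

def keep_only_image(post_content):
    """Return the first matching <img> tag, or None if there is none."""
    match = IMG_RE.search(post_content)
    return match.group(0) if match else None

sample = 'RANDOM JUNK<img src="http://domain.tld/randomString.jpg" />MORE RANDOM JUNK'
print(keep_only_image(sample))
```

Returning None for posts without a matching image lets the calling script skip those rows instead of overwriting them with nothing. Back up the table before running any update against it.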