After I used a caching plugin to fix numerous hotlinks, some of the generated HTML saved to the database is not quite right. For example:
<a href="http://www.mbird.com/wp-content/uploads/2011/04/psycho_blanket.jpg"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 164px; height: 251px;" src="http://www.mbird.com/wp-content/uploads/2011/04/psycho_blanket1.jpg" alt="" id="BLOGGER_PHOTO_ID_5306768463834252178" border="0"></a>
In this example an extra 1 has been added before the extension in the src. In other posts the extra digits are 2 or 21.
As you can see, the href and src don’t agree. The href is right.
Any suggestions for how to fix this? I’m guessing I need to run a regex against linked images in post_content to test for the mismatch. I don’t have much experience with regex in PHP and need some help.
// get_posts() returns only the 5 most recent posts by default; -1 asks for all of them
$posts = get_posts( array( 'numberposts' => -1 ) );
foreach( $posts as $post ) {
// retrieve content of post; get_posts() returns objects, so use property access
$content = $post->post_content;
// do stuff that I'm unsure about with $content to home in on linked images with mismatched filenames and fix them
// write it back
$post->post_content = $content;
// Update the post in the database
wp_update_post( $post );
}
This tested regex solution should do it:
Given an IMG element wrapped inside an A element, this code replaces the SRC attribute of the IMG element with the HREF attribute of the A element. It assumes that all the HREF and SRC attribute values are wrapped in double quotes.
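A minimal sketch of the approach described above, not necessarily the original snippet, might look like the following. The function name `fix_linked_image_src` is hypothetical, and the check that the href ends in an image extension is an added guard that the description does not mention. It is there so that images linking to ordinary pages are left alone. As in the description, it assumes both attribute values are wrapped in double quotes, and it also assumes the IMG is the first element inside the A.

```php
<?php
// Hypothetical helper: copy the href of a wrapping <a> into the src of the
// <img> directly inside it. Assumes double-quoted attribute values.
function fix_linked_image_src( $content ) {
    // 1: '<a ... href="'   2: href value   3: '" ...><img ... src="'   4: closing quote of src
    $pattern = '%(<a\b[^>]*?\bhref=")([^"]*)("[^>]*>\s*<img\b[^>]*?\bsrc=")[^"]*(")%i';

    return preg_replace_callback(
        $pattern,
        function ( $m ) {
            // Added guard (an assumption, not part of the description above):
            // only rewrite when the link target looks like an image file.
            if ( ! preg_match( '/\.(jpe?g|png|gif)$/i', $m[2] ) ) {
                return $m[0];
            }
            // Rebuild the matched text with the href value in place of the old src.
            return $m[1] . $m[2] . $m[3] . $m[2] . $m[4];
        },
        $content
    );
}

// Example with a made-up URL: the src gains the href's filename.
$html = '<a href="http://www.example.com/img.jpg"><img src="http://www.example.com/img1.jpg" alt=""></a>';
echo fix_linked_image_src( $html ), "\n";
// prints: <a href="http://www.example.com/img.jpg"><img src="http://www.example.com/img.jpg" alt=""></a>
```

Inside the loop from the question, this would be called as `$post->post_content = fix_linked_image_src( $post->post_content );` before `wp_update_post( $post );`. Back up the database first, since the change is written straight into post_content.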