I am making a script that gets the content and images of blog posts using DOM and regular expressions.
The script is finished except for one thing. Getting the content already works. What I still need is to get all of each post's images EXCEPT THE FIRST and add them to the new content, stored as varcontent1, varcontent2, varcontent3 and so on.
The script runs 25 times (the number of posts on the page), with a loop variable $i. The following code gets the current post's content and saves it to $varcontent1. It also gets all the images of the whole site, filters them against a list of bad words, and prints them as an array.
My question is: how can I save the current post's images with the current post? Afterwards I will transform them to <img src="xxxx"> (I think I know how to do that).
UPDATED: the results will be submitted to a form. What if I put the current post's image URLs into a new POST variable?
Note: I can get the images with DOM because I load the page, not loadHTML.
preg_match_all('!http://.+\.(?:jpe?g|png|gif)!Ui', $content, $matches);
preg_match_all('/\S+(list|of|bad|words)\S+/i', $content, $bads);
$filtered = array_values(array_diff($matches[0], $bads[0]));
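One way to tie the images to a single post is to run the same pattern on that post's content only, then drop the first result. The following is only a rough sketch: the function names `extractPostImages` and `imagesToHtml` are made up for illustration, the bad words are passed in as a plain array, and it assumes "the first image" means the first one left after filtering.

```php
<?php
// Hypothetical helper: returns the image URLs found in ONE post's content,
// minus URLs containing a bad word, and minus the first remaining image.
function extractPostImages(string $content, array $badWords): array
{
    // Same URL pattern as above, applied to a single post's content.
    preg_match_all('!http://.+\.(?:jpe?g|png|gif)!Ui', $content, $matches);

    // Drop any URL that contains one of the bad words (case-insensitive).
    $filtered = array_filter($matches[0], function (string $url) use ($badWords): bool {
        foreach ($badWords as $word) {
            if (stripos($url, $word) !== false) {
                return false;
            }
        }
        return true;
    });

    // Offset 1 skips the first image; array_values reindexes from 0 first.
    return array_slice(array_values($filtered), 1);
}

// Hypothetical helper: turns a list of URLs into <img> tags.
function imagesToHtml(array $urls): string
{
    $html = '';
    foreach ($urls as $url) {
        $html .= '<img src="' . htmlspecialchars($url, ENT_QUOTES) . '">';
    }
    return $html;
}
```

Inside the loop over the 25 posts this could be called as `imagesToHtml(extractPostImages($content, $badWords))` and the result appended to that post's content.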
Try using offset…
Don’t use 1,2,3… use arrays…
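As a rough illustration of the array idea, the image URLs could be kept in one array keyed by the post index $i and written into the form as array-style fields, so the receiving script gets them back grouped per post. The field name `images` and the function `buildImageFields` are assumptions made for this sketch, not part of the original script.

```php
<?php
// Hypothetical helper: builds hidden form inputs for every post's image URLs.
// $imagesByPost maps a post index ($i) to a list of image URLs.
function buildImageFields(array $imagesByPost): string
{
    $fields = '';
    foreach ($imagesByPost as $i => $urls) {
        foreach ($urls as $url) {
            // A name like images[1][] makes PHP rebuild a nested array in $_POST.
            $fields .= sprintf(
                '<input type="hidden" name="images[%d][]" value="%s">' . "\n",
                $i,
                htmlspecialchars($url, ENT_QUOTES)
            );
        }
    }
    return $fields;
}

// One array for all posts instead of numbered variables.
$imagesByPost = [];
$imagesByPost[1] = ['http://example.com/c.gif', 'http://example.com/d.jpeg'];
$imagesByPost[2] = []; // a post with no extra images produces no fields

echo buildImageFields($imagesByPost);
```

On the receiving side, `$_POST['images'][1]` would then hold the list of URLs for post 1.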
When reading posts…