I have a script that fetches content from a website, and I want to modify all the links in it. Suppose:
$html = str_get_html('<h2 class="r"><a class="l" href="http://www.example.com/2009/07/page.html" onmousedown="return curwt(this, \'http://www.example.com/2009/07/page.html\')">SEO Result Boost <b> </b></a></h2>');
So, is it possible to rewrite it this way?
<h2 class="r"><a class="l" href="http://www.site.com?http://www.example.com/2009/07/page.html">SEO Result Boost <b> </b></a></h2>
I have read its manual but cannot figure out how to do it ( http://simplehtmldom.sourceforge.net/#fragment-12 )
Assuming the answer to a related question works, you should be able to use the following with Simple HTML DOM:
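A minimal sketch of how this could look with Simple HTML DOM. It assumes `simple_html_dom.php` sits next to the script, and it uses `http://www.site.com?` from the question as the prefix. The `$prefix` and `$result` variable names are only illustrative. According to the library's manual, setting an attribute to `null` removes it.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

// Redirect prefix taken from the question.
$prefix = 'http://www.site.com?';

$html = str_get_html('<h2 class="r"><a class="l" href="http://www.example.com/2009/07/page.html" onmousedown="return curwt(this, \'http://www.example.com/2009/07/page.html\')">SEO Result Boost <b> </b></a></h2>');

// Visit every anchor that has an href attribute.
foreach ($html->find('a[href]') as $a) {
    // Put the prefix in front of the existing target.
    $a->href = $prefix . $a->href;
    // Setting an attribute to null removes it from the output.
    $a->onmousedown = null;
}

// save() without arguments returns the modified markup as a string.
$result = $html->save();
echo $result;
```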
or
Using PHP’s native DOM library:
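A rough sketch with `DOMDocument`, which ships with PHP. The function name `prefixLinks` and its parameters are made up for illustration. Note that `loadHTML()` wraps fragments in `<html><body>`, so the sketch returns only the children of `<body>`, and that non-ASCII input may need extra encoding handling that is left out here.

```php
<?php
// Hypothetical helper: prefix every link in an HTML fragment with $prefix
// and drop the onmousedown handler.
function prefixLinks(string $html, string $prefix): string
{
    if (trim($html) === '') {
        return '';
    }

    $dom = new DOMDocument();
    // Real-world markup is rarely valid; collect parser warnings silently.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    foreach ($dom->getElementsByTagName('a') as $a) {
        if (!$a->hasAttribute('href')) {
            continue; // named anchors have nothing to rewrite
        }
        $a->setAttribute('href', $prefix . $a->getAttribute('href'));
        $a->removeAttribute('onmousedown');
    }

    // loadHTML() added <html><body>; return only what was inside <body>.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo prefixLinks(
    '<h2 class="r"><a class="l" href="http://www.example.com/2009/07/page.html" onmousedown="return curwt(this, \'http://www.example.com/2009/07/page.html\')">SEO Result Boost</a></h2>',
    'http://www.site.com?'
);
```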
Checking the $href:
You would check for a relative link and prepend the address of the site you're pulling the content from, since most sites use relative links. (This is where a regular expression matcher would be your best friend.)
For relative links, you prepend the absolute path of the site you are getting the links from.
For absolute links, you just append the link unchanged.
Example links:
site relative: /images/picture.jpg
document relative: ../images/picture.jpg
absolute: http://somesite.com/images/picture.jpg
(Note: there is a little more work that needs to be done here, because if you're handling "document relative" links, you will have to know which directory you're currently in. Site relative links should be good to go, as long as you have the root folder of the site you're getting links from.)
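The three link types above could be handled along these lines. This is only a sketch: `toAbsoluteUrl` is a hypothetical helper, `$baseUrl` is assumed to be the full URL of the page the link was found on, and `<base>` tags, query strings and fragment-only links are not handled.

```php
<?php
// Hypothetical helper: turn a scraped href into an absolute URL.
// $baseUrl is the address of the page the link came from (assumed valid).
function toAbsoluteUrl(string $href, string $baseUrl): string
{
    // Absolute: anything that starts with a scheme, e.g. http: or mailto:
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $href)) {
        return $href;
    }

    $base = parse_url($baseUrl);

    // Protocol relative (//host/path): reuse the scheme of the base page.
    if (strncmp($href, '//', 2) === 0) {
        return $base['scheme'] . ':' . $href;
    }

    $root = $base['scheme'] . '://' . $base['host'];
    if (isset($base['port'])) {
        $root .= ':' . $base['port'];
    }

    if (strncmp($href, '/', 1) === 0) {
        // Site relative: only the site root is needed.
        $path = $href;
    } else {
        // Document relative: resolve against the directory of the base page.
        $basePath = $base['path'] ?? '/';
        $dir = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path = $dir . $href;
    }

    // Collapse "." and ".." segments.
    $resolved = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($resolved);
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;
        }
    }

    return $root . '/' . implode('/', $resolved);
}
```

For a page at http://somesite.com/2009/07/page.html, the document relative link ../images/picture.jpg would resolve to http://somesite.com/2009/images/picture.jpg, while /images/picture.jpg would resolve to http://somesite.com/images/picture.jpg.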