I downloaded a page with cURL and parsed the HTML with the “PHP Simple HTML DOM Parser”.
The problem is that when I display the outer HTML of an element, the Spanish characters come out wrong.
For example:
The original text
la puja por la compra de los derechos de publicación ha sido la más
reñida del año.
The displayed text
la puja por la compra de los derechos de publicaciÃ³n ha sido la mÃ¡s
reÃ±ida del aÃ±o.
What would cause the letters to change?
Each accented letter is showing up as multiple characters in the output, so I’m pretty sure you’re displaying multi-byte UTF-8 characters in a single-byte charset (probably ISO-8859-1). Every byte of the UTF-8 sequence then gets rendered as its own character.
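Here is a minimal sketch of that mechanism, not a diagnosis of your exact setup. It assumes the downloaded page really is UTF-8 and that the mbstring extension is available; the variable names are made up for illustration. It reproduces the garbling by treating each UTF-8 byte as an ISO-8859-1 character, then shows that declaring UTF-8 in the response header is one way to avoid it.

```php
<?php
// Assumption: the fetched page is UTF-8, so declare that charset to the browser.
// Without this (or a matching <meta charset>), the browser may fall back to ISO-8859-1.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// "\u{00F3}" is ó and "\u{00F1}" is ñ; PHP 7+ emits them as UTF-8 bytes.
$original = "publicaci\u{00F3}n, a\u{00F1}o";

// Simulate the wrong interpretation: read every byte as one ISO-8859-1
// character and re-encode the result as UTF-8.
$garbled = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Reversing the conversion recovers the original bytes.
$restored = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');

echo $garbled, PHP_EOL;   // two characters appear where each accented letter was
echo $restored, PHP_EOL;  // the text as it was fetched
```

If the page you fetched turns out to be in a different charset, the header (or a conversion step) would need to match that charset instead.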
Have a look at this blog post that I wrote a while ago which should talk you through all of the potential problem areas.