I fetch lines of UTF-8 text from a page and then dump them into a file. The text in the original page appears fine. However, the text in the output file appears scrambled!
My attempt:
$myFile = "testFile.txt";
$fh = fopen($myFile, 'w') or die("can't open file");
$pageContent = file_get_contents("page.html");
//Here: use regex to grab the title ...
$stringData = $title."\n";
fwrite($fh, utf8_encode($stringData));
fclose($fh);
Before writing anything to the file, I saved it as UTF-8, and I also tried saving it as Unicode. I still get scrambled text such as:
ÊãäíÇÊí ááÌãíÚ
I’m not using PHP5
Any help will be appreciated…
Don’t use utf8_encode! Sorry for the shouting, it’s just misused way too often.
Your text is already in UTF-8.* You do not need to encode it to UTF-8 again.
utf8_encode converts Latin1 encoded text to UTF-8. Your text is not Latin1 encoded. That’s why it screws up. Just read and write the text, done. No encoding conversion or re-encoding necessary.
* Assuming page.html is encoded in UTF-8. From what you’re saying, it seems to be.
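For what it's worth, here is a minimal sketch of the "just read and write" approach, under the same assumption that page.html is already UTF-8. The saveTitle helper and the title regex are made up for illustration, since the question does not show the regex it actually uses. It avoids PHP5-only features.

```php
<?php
// Hypothetical helper: copy the <title> of a page into a text file
// without touching the encoding. Assumes the page is already UTF-8.
function saveTitle($sourcePath, $targetPath)
{
    $pageContent = file_get_contents($sourcePath);
    if ($pageContent === false) {
        return false;
    }

    // Assumed pattern for illustration only; the original regex is not shown.
    if (!preg_match('/<title>(.*?)<\/title>/is', $pageContent, $matches)) {
        return false;
    }

    $fh = fopen($targetPath, 'w');
    if ($fh === false) {
        return false;
    }

    // The bytes are already UTF-8, so write them as-is: no utf8_encode().
    fwrite($fh, trim($matches[1]) . "\n");
    fclose($fh);

    return true;
}
```

Calling saveTitle("page.html", "testFile.txt") would then write the title's bytes unchanged. If the output still looks scrambled, the next thing to check is whether the viewer or editor is opening testFile.txt as UTF-8.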