I’m trying to open a file with regular HTML and special Unicode characters such


Asked: May 27, 2026


I’m trying to open a file with regular HTML and special Unicode characters such as “ÖÄÅ öäå” (Swedish), format it and then output it to a file.

So far everything works out great, I can open the file, find the parts I need and output into a file.

But here is the problem:

I can’t save the Unicode data I read in to the output file without losing the encoding (e.g. an ‘ö’ becomes ‘Ã¶’).

If I enter the characters manually in the code itself, I can both run the regex on them and output them in the correct encoding. It only fails when I import a file, format it and then output it.

Example of a working approach using octal escapes (i.e. this can output to the file without the encoding problem):

my $charsSWE = "öäåÅÄÖ";
# \344 = ä
# \345 = å
# \305 = Å
# \304 = Ä
# \326 = Ö
# \366 = ö
my $SwedishLetters = '\344 \345 \305 \304 \326 \366';

if($charsSWE =~ /([$SwedishLetters]+)/){
    print "Output: $1\n";
}

The approach below does not work because the encoding is lost (this is a quick illustration of that part of the code, but the concept is the same, i.e. open the file, fetch and output):

open(FH, 'swedish.htm') or die("File could not be opened");

    while(<FH>)
    {
        my @List =  /([$SwedishLetters]+)/g;    
        message($List[0]) if @List;
    }

close(FH);


1 Answer

Editorial Team · Answer 1 · 2026-05-27T10:23:05+00:00


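use Encode;

# Decode the input from UTF-8 as it is read.
open my $in, "<:encoding(UTF-8)", "swedish.htm" or die "Cannot open swedish.htm: $!";

# do stuff

# Encode the output as UTF-8 as it is written.
open my $out, ">:encoding(UTF-8)", "output.htm" or die "Cannot open output.htm: $!";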

You may need to use a different encoding if the input file is not actually UTF-8, for example :encoding(ISO-8859-1) for a Latin-1 file.
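To make the idea concrete, here is a minimal sketch of the full round trip: decode on read, match against characters, encode on write. It assumes the data is UTF-8. The file names sample_swedish.htm and sample_output.htm and the sample text are hypothetical, chosen only so the script runs on its own without touching real files. The Swedish letters are written as hex escapes so the script does not depend on how the source file itself is saved.

```perl
use strict;
use warnings;

# Hypothetical file names for illustration only.
my ($in_name, $out_name) = ('sample_swedish.htm', 'sample_output.htm');

# Write a small UTF-8 sample input so the sketch is self-contained.
# \x{F6} = ö, \x{E5} = å, \x{D6} = Ö
open my $seed, '>:encoding(UTF-8)', $in_name or die "Cannot write $in_name: $!";
print {$seed} "<p>Sm\x{F6}rg\x{E5}sbord p\x{E5} \x{D6}land</p>\n<p>plain ascii</p>\n";
close $seed or die "Cannot close $in_name: $!";

# Decode bytes into characters on read, encode characters into bytes on write.
open my $in,  '<:encoding(UTF-8)', $in_name  or die "Cannot open $in_name: $!";
open my $out, '>:encoding(UTF-8)', $out_name or die "Cannot open $out_name: $!";

my @found;
while (my $line = <$in>) {
    # The class holds ö ä å Ö Ä Å as code points; because the line was
    # decoded, it is matched character by character, not byte by byte.
    while ($line =~ /([\x{F6}\x{E4}\x{E5}\x{D6}\x{C4}\x{C5}]+)/g) {
        push @found, $1;
        print {$out} "$1\n";
    }
}

close $in  or die "Cannot close $in_name: $!";
close $out or die "Cannot close $out_name: $!";
```

With this sample input, the output file should contain the four matches ö, å, å and Ö, one per line, stored as valid UTF-8. If you also print the matches to the terminal, STDOUT needs an encoding layer of its own, e.g. binmode(STDOUT, ':encoding(UTF-8)'), otherwise the same kind of garbling can reappear there.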
