I’m using BeautifulSoup to scrape a Swedish web page. On the web page, the

Question

Asked: June 14, 2026

I’m using BeautifulSoup to scrape a Swedish web page. On the web page, the information I want to extract looks like this:

"Öhman Företagsobligationsfond"

When I print the information from the Python script it looks like this:

"Ã&ndash;hman FÃ¶retagsobligationsfond"

I’m new to Python. I have searched for answers and tried putting # -*- coding: utf-8 -*- at the beginning of the code, but it does not work.

I’m thinking of moving from Sweden to solve this issue.


1 Answer

Editorial Team · Answer 1 · 2026-06-14T02:26:56+00:00

The # -*- coding: utf-8 -*- line only specifies the encoding of the source file itself. The page you are parsing has probably declared a faulty encoding (or none at all), so Beautiful Soup decodes its bytes with the wrong codec. Try specifying the encoding when building the soup. Here’s a small example:

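# -*- coding: utf-8 -*-
try:
    from bs4 import BeautifulSoup            # Beautiful Soup 4
except ImportError:
    from BeautifulSoup import BeautifulSoup  # Beautiful Soup 3 (Python 2 only)

# UTF-8 encoded bytes whose <meta> tag wrongly declares latin-1.
markup = u'''
<html>
    <head>
        <title>Övriga fakta</title>
        <meta charset="latin-1" />
    </head>
    <body>
        <h1>Öhman Företagsobligationsfond</h1>
        <p>Detta är en svensk sida.</p>
    </body>
</html>
'''.encode('utf-8')

# No hint given: the faulty declared encoding is used.
soup = BeautifulSoup(markup)
print(soup.find('h1'))

# Override the declared encoding with the real one.
try:
    # Version 4
    soup = BeautifulSoup(markup, from_encoding='utf-8')
except TypeError:
    # Version 3
    soup = BeautifulSoup(markup, fromEncoding='utf-8')

print(soup.find('h1'))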

The output from this is:

<h1>Ãhman FÃ¶retagsobligationsfond</h1>
<h1>Öhman Företagsobligationsfond</h1>

In Beautiful Soup 4, the parameter is from_encoding, while in version 3, the parameter is fromEncoding.
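To see why the text gets garbled in the first place, here is a minimal sketch using only the Python 3 standard library. It assumes the page's bytes are really UTF-8 and that they get decoded as Windows-1252 somewhere along the way; that codec is a guess based on the "Ã–" in the question's output, and the variable names are just for illustration.

```python
# The text as it appears on the web page.
original = "Öhman Företagsobligationsfond"

# What a UTF-8 page actually sends over the wire: each of Ö and ö is two bytes.
raw_bytes = original.encode("utf-8")

# Decoding those bytes with the wrong codec (assumed here: Windows-1252)
# turns every two-byte character into two unrelated characters.
garbled = raw_bytes.decode("cp1252")
print(garbled)

# Reversing the wrong decode and applying the right one recovers the text.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)
```

Passing from_encoding to Beautiful Soup avoids the wrong decode up front, which is safer than repairing the text afterwards.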
