Trying to transition from urllib in python 2 to python 3. I can output

Asked: June 3, 2026

I'm trying to transition from urllib in Python 2 to Python 3. I can output the HTML source using .urlopen(), but I can't search it using the .find() method.

import urllib.request
fh = urllib.request.urlopen("http://stackoverflow.com")
html = fh.read()
fh.close()

print(html.find("<p>"))

I get a TypeError. I understand that .read() is returning a bytes object, but I'm pretty fuzzy about what that actually means. I've tried a few SO answers, which have been dead ends. My question is:

Is there a straightforward, native method to get the page source of a URL as a string in Python 3?

1 Answer

Editorial Team · Answer 1 · 2026-06-03T07:39:04+00:00

Use html.decode('utf-8') (or whatever encoding the page actually uses) to get a str object that you can call .find() on.

.decode() takes a flat sequence of bytes and transforms it (by reversing a character encoding, such as UTF-8) into a string of Unicode code points (characters).
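
As a rough sketch of how this might look in practice: the helper names decode_body and fetch_text below are hypothetical, and falling back to UTF-8 when the server declares no charset is an assumption, not a rule. The sample bytes stand in for a real response, so the snippet runs without a network connection.

```python
import urllib.request


def decode_body(raw, charset=None):
    """Turn raw response bytes into a str (hypothetical helper)."""
    # Assumption: fall back to UTF-8 when no charset is declared.
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    return raw.decode(charset or "utf-8", errors="replace")


def fetch_text(url):
    """Fetch a URL and return its body as a str (hypothetical helper)."""
    with urllib.request.urlopen(url) as fh:
        raw = fh.read()  # bytes, not str
        # Charset from the Content-Type header, or None if not declared.
        charset = fh.headers.get_content_charset()
    return decode_body(raw, charset)


# Stand-in for fh.read(): UTF-8 encoded bytes.
sample = b"<html><body><p>caf\xc3\xa9</p></body></html>"
text = decode_body(sample)

print(type(text).__name__)  # str
print(text.find("<p>"))     # 12
```

With a live connection, fetch_text("http://stackoverflow.com").find("<p>") would follow the same path. If you only need to search the raw bytes, html.find(b"<p>") also works without decoding, because a bytes object can be searched with a bytes pattern.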
