I’m looking for an HTML Parser module for Python that can help me get

Asked: June 8, 2026 · 2026-06-08T12:31:11+00:00

I’m looking for an HTML Parser module for Python that can help me get the tags in the form of Python lists/dictionaries/objects.

If I have a document of the form:

<html>
<head>Heading</head>
<body attr1='val1'>
    <div class='container'>
        <div id='class'>Something here</div>
        <div>Something else</div>
    </div>
</body>
</html>

then it should give me a way to access the nested tags via the name or id of the HTML tag so that I can basically ask it to get me the content/text in the div tag with class='container' contained within the body tag, or something similar.

If you’ve used Firefox’s "Inspect element" feature (view HTML) you would know that it gives you all the tags in a nice nested manner like a tree.

I’d prefer a built-in module but that might be asking a little too much.

I went through a lot of questions on Stack Overflow and a few blogs on the internet. Most of them suggest BeautifulSoup, lxml, or HTMLParser, but few of them detail the functionality; they simply end as a debate over which one is faster or more efficient.

1 Answer

Editorial Team · Answer 1 · 2026-06-08T12:31:12+00:00

"So that I can ask it to get me the content/text in the div tag with class='container' contained within the body tag, or something similar."

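try:
    from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 (legacy, Python 2)
except ImportError:
    from bs4 import BeautifulSoup  # BeautifulSoup 4

# The HTML document from the question, as a string.
html = """<html><head>Heading</head><body attr1='val1'>
<div class='container'>
<div id='class'>Something here</div><div>Something else</div>
</div></body></html>"""
parsed_html = BeautifulSoup(html)  # with bs4, pass "html.parser" as a second argument to pin the parser
print(parsed_html.body.find('div', attrs={'class': 'container'}).text)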
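The same kind of lookup also works by id, and tag attributes can be read like a dictionary. The following is a minimal sketch, assuming BeautifulSoup 4 (the bs4 package) is installed and using the standard library's "html.parser" backend; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# The document from the question.
html = """<html>
<head>Heading</head>
<body attr1='val1'>
    <div class='container'>
        <div id='class'>Something here</div>
        <div>Something else</div>
    </div>
</body>
</html>"""

soup = BeautifulSoup(html, "html.parser")

# Attributes of a tag are exposed as a plain dictionary.
body_attrs = soup.body.attrs

# Look a tag up by its id, anywhere in the document.
by_id_text = soup.find(id="class").get_text()

# Collect the text of the direct <div> children of the container.
container = soup.body.find("div", class_="container")
child_texts = [d.get_text() for d in container.find_all("div", recursive=False)]

print(body_attrs)
print(by_id_text)
print(child_texts)
```

Here find returns the first match (or None when nothing matches), while find_all returns a list, so the results can be handled as ordinary Python objects.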

You probably don't need performance comparisons; just read how BeautifulSoup works in its official documentation.
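Since the question mentions a preference for a built-in module, here is a rough sketch of what that could look like with the standard library's html.parser. HTMLParser only reports events (start tag, end tag, text), so the tree has to be built by hand. The TreeBuilder class and the find helper below are hypothetical names made up for this illustration, not part of any library, and the sketch does not handle void elements such as br or badly broken markup the way BeautifulSoup does.

```python
from html.parser import HTMLParser


class TreeBuilder(HTMLParser):
    """Hypothetical helper: builds a tree of nested dicts from parser events."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "[document]", "attrs": {}, "children": [], "text": ""}
        self._stack = [self.root]  # path from the root to the currently open tag

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": [], "text": ""}
        self._stack[-1]["children"].append(node)
        self._stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open tag with this name; ignore stray closing tags.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i]["tag"] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        # Text directly inside the currently open tag.
        self._stack[-1]["text"] += data


def find(node, tag, attrs=None):
    """Depth-first search for the first descendant with this tag and attributes."""
    attrs = attrs or {}
    for child in node["children"]:
        if child["tag"] == tag and all(child["attrs"].get(k) == v for k, v in attrs.items()):
            return child
        found = find(child, tag, attrs)
        if found is not None:
            return found
    return None


html = """<html>
<head>Heading</head>
<body attr1='val1'>
    <div class='container'>
        <div id='class'>Something here</div>
        <div>Something else</div>
    </div>
</body>
</html>"""

builder = TreeBuilder()
builder.feed(html)
builder.close()

body = find(builder.root, "body")
container = find(body, "div", {"class": "container"})
inner = find(container, "div", {"id": "class"})
print(inner["text"])
```

This gives plain dictionaries and lists, but it is noticeably more code than the BeautifulSoup version and far less forgiving of real-world HTML.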
