I’m using beautifulsoup to extract images and links from a html string. It all

Question

Asked: June 17, 2026

I’m using BeautifulSoup to extract images and links from an HTML string. It all works fine, except that links whose contents include a nested tag throw an error.

Example Link:

<a href="http://www.example.com"><strong>Link Text</strong></a>

Python Code:

soup = BeautifulSoup(contents)
links = soup.findAll('a')
for link in links:
    print link.contents # generates error
    print str(link.contents) # outputs [Link Text]

Error Message:

TypeError: sequence item 0: expected string, Tag found

I don’t really want to loop through the child tags in the link text; I simply want to return the raw contents. Is this possible with BeautifulSoup?

1 Answer

Editorial Team · Answer 1 · 2026-06-17T03:03:19+00:00

To get just the text content of a tag, use the element.get_text() method, which returns the (optionally stripped) text of the current element and of any tags nested inside it:

print link.get_text(' ', strip=True)

The first argument is the separator used to join all text elements, and setting strip to True means each text element is first stripped of leading and trailing whitespace. This gives you neatly processed text in most cases.
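As a minimal sketch of the difference the two arguments make, assuming Python 3 with the bs4 package and the built-in "html.parser" backend (the sample markup and variable names are illustrative, not from the question):

```python
from bs4 import BeautifulSoup

# Illustrative markup: the link text is split across a nested tag and
# carries stray whitespace.
html = '<a href="http://www.example.com"> <strong>Link</strong>\n  Text </a>'
soup = BeautifulSoup(html, "html.parser")
link = soup.find("a")

# Without arguments, the text pieces are concatenated exactly as they appear.
plain = link.get_text()

# With a separator and strip=True, each piece is stripped, empty pieces are
# dropped, and the rest are joined with the separator.
joined = link.get_text(" ", strip=True)

print(repr(plain))
print(joined)
```

Here `joined` comes out as the single string "Link Text", while `plain` keeps the original whitespace.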

You can also use the .stripped_strings iterable:

print u' '.join(link.stripped_strings)

This has essentially the same effect, but lets you process or filter the stripped strings before joining them.
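A small sketch of that filtering step, assuming Python 3 with bs4; the "|" separator in the sample markup is a made-up example of a piece you might want to drop:

```python
from bs4 import BeautifulSoup

# Illustrative markup with a decorative "|" between two nested tags.
html = '<a href="http://www.example.com"><strong>Link</strong> | <em>Text</em></a>'
soup = BeautifulSoup(html, "html.parser")
link = soup.find("a")

# stripped_strings yields each text piece with surrounding whitespace removed,
# so the pieces can be filtered before they are joined.
parts = [s for s in link.stripped_strings if s != "|"]
text = " ".join(parts)

print(text)
```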

To get the raw contents, including the markup of any child tags, call str() or unicode() on each child item:

print u''.join(unicode(item) for item in link)

This works for both the Tag and NavigableString items that the link contains.
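In Python 3 there is no unicode(), so a rough equivalent uses str() instead. The sketch below assumes bs4 and also shows Tag.decode_contents(), which is not part of the answer above but returns the inner markup in one call:

```python
from bs4 import BeautifulSoup

# The example link from the question.
html = '<a href="http://www.example.com"><strong>Link Text</strong></a>'
soup = BeautifulSoup(html, "html.parser")
link = soup.find("a")

# Iterating over a Tag yields its direct children (Tag or NavigableString);
# str() renders each one, markup included.
raw = "".join(str(item) for item in link)

# decode_contents() renders everything inside the tag as one string.
inner = link.decode_contents()

print(raw)
print(inner)
```

For this input both approaches give the markup of the nested strong tag without the surrounding a tag.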
