I am reading some product pages in python/BS4 and find an interesting variety in

Question

Asked: June 2, 2026 (2026-06-02T05:01:01+00:00)

I am reading some product pages in Python/BS4 and find an interesting variation in one line of the HTML: the price of the item.

Sometimes the HTML is:

<span class="currency">$<span id="product_price">0.00</span></span>

And other times it will be:

<span class="currency">$17.95</span></b>

Using price = soup.find('span', {'class' : 'currency'})

I can isolate the span, but when I try to get just the text, using

priceStr = price.findAll(text=re.compile(r''))

and then write it to the output file with

divpage.write('Price = ' + str(priceStr) + '\n')

I get (for the first example):

Price = [u'$', u'0.00']

My question is, is there a way to read JUST the price, without the '$', and how do I translate the encoding from the "u'0.00'" to just "0.00"?

I know I can do this using the Python find & replace functions, but I'd like to stick to BS4 as much as possible, without having to write a check for one form or the other…


1 Answer

Editorial Team · Answer 1 · 2026-06-02T05:01:03+00:00


I would use get_text() instead of find_all(). It joins all the text inside the tag into one string, so it works for both forms of the HTML:

price_str = price.get_text() # $17.95

Then you can use lstrip() to get rid of the dollar sign:

price_str = price_str.lstrip('$') # 17.95

And you’re done!
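
For reference, here is a minimal sketch of the two steps applied to both HTML forms from the question. The helper name extract_price and the choice of the built-in "html.parser" backend are my own assumptions for illustration, not something from the original post:

```python
from bs4 import BeautifulSoup


def extract_price(html):
    """Hypothetical helper: return the price text without the leading '$'."""
    soup = BeautifulSoup(html, "html.parser")
    price = soup.find("span", {"class": "currency"})
    if price is None:
        # No currency span on this page.
        return None
    # get_text() concatenates every text node inside the span,
    # so the nested and the flat form both yield one string.
    return price.get_text(strip=True).lstrip("$")


# The two variants quoted in the question.
nested = '<span class="currency">$<span id="product_price">0.00</span></span>'
flat = '<span class="currency">$17.95</span></b>'

print(extract_price(nested))  # 0.00
print(extract_price(flat))    # 17.95
```

As for the u'0.00' part: that is not an encoding to translate. Calling str() on a list shows the repr of each element, and in Python 2 the repr of a unicode string carries the u prefix. Writing the single string returned above, rather than a list, gives plain 0.00 in the output file.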
