I have the following code. html = urllib2.urlopen('https://ebet.tab.co.nz/results/CHCG-reslt05070400.html').read() soup = BeautifulSoup(html) data =

Question

Asked: June 3, 2026 (2026-06-03T20:46:03+00:00)

I have the following code.

# Imports added for completeness; bs4 is assumed because the question mentions it.
import re
import urllib2

from bs4 import BeautifulSoup

html = urllib2.urlopen(
    'https://ebet.tab.co.nz/results/CHCG-reslt05070400.html').read()

soup = BeautifulSoup(html)
data = soup.findAll('div', {'class' : 'header bold'})
match = re.search('R', data[0].text)
race_title = data[0].text[(match.start()):]
race_title = str(race_title.strip(' \t\n\r'))
print race_title

The output I get in the console is:

Race 1 PEDIGREE ADVANCE SPRINT
                C0
                295 m

I thought strip would get rid of any kind of whitespace between SPRINT and C0, but obviously I am missing something, so I need help understanding this result. Is it because bs4 returns the string as Unicode or something?


1 Answer

Editorial Team · Answer 1 · 2026-06-03T20:46:05+00:00

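strip() removes only leading and trailing characters; it never touches anything in the middle of the string, so Unicode is not the issue here. If you want to remove the newlines, use replace("\n", ""). Note that this removes only the newline characters, so the indentation spaces between the lines stay in place.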
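
To see the difference, here is a minimal sketch. The sample string is an assumption modelled on the console output in the question (the exact whitespace on the real page may differ), and the `" ".join(s.split())` idiom is an additional option, not part of the answer above.

```python
# Sample text shaped like the console output in the question
# (assumption: the real page may use different whitespace).
race_title = "Race 1 PEDIGREE ADVANCE SPRINT\n                C0\n                295 m\n"

# strip() trims only the two ends, so inner newlines and indentation remain.
stripped = race_title.strip()

# replace() removes every newline but leaves the indentation spaces behind.
no_newlines = stripped.replace("\n", "")

# split() with no arguments splits on any run of whitespace;
# joining with a single space collapses each run into one space.
collapsed = " ".join(race_title.split())

print(repr(stripped))
print(repr(no_newlines))
print(collapsed)
```

If the goal is a single clean line, the last form is probably the most convenient, since it handles spaces, tabs, and newlines in one step.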
