I am logging into a site, making a search query, and then filtering the results with BeautifulSoup to get all the text in the <b> tags. From the results I'd like to check whether the search term (Testing) is present. My current code is below. The problem is that even when there is a result and the term is present, I still get a "not found" response. I have printed the filtered query and read through it, and the result is definitely there, so the error is in the searching part. I think the problem is that in the HTML the word Testing isn't by itself: it appears as Testing.example or Testing.test, so the search can't find it on its own surrounded by spaces. How do I search for a word/phrase within a longer word/phrase?
I need "Testing" to be found in "example.Testing.example" or in "test.Testing.example".
Hope that makes sense.
Thanks
words = ["Testing"]
br.open('http://www.example.com/browse.php?psec=2&search=%s' % words)
html = br.response().read()
soup = BeautifulSoup(html)
filtered = soup.findAll('b')
# print filtered
for word in words:
    if word in filtered:
        print "%s found." % word
    else:
        print "%s not found." % word
Edit
[<b><a title="Unknown">---</a></b>, <b>Welcome Back<br /><a href="/user/"><
span style="color:#0080FF;"></span></a>!<br /></b>, <b><span class="smallfo
nt"><a href="/messages.php?action=viewmailbox"><img height="14px" style="border:
none" alt="inbox" title="inbox (no new messages)" src="/pic/pn_inbox.gif" /></a>
59 (0 New)</span></b>, <b><span class="smallfont"> <a href="/message
s.php?action=viewmailbox&box=-1"><img height="14px" style="border:none" alt=
"sentbox" title="sentbox" src="/pic/pn_sentbox.gif" /></a> 37</span></b>, <b>Sho
w all</b>, <b><< Prev</b>, <b>Next >></b>, <b>1 - 7</b>, **<b>The.Testing
.example.T3Z6.L</b>**, <b><span style="color:#FF5500;">dgHn</span
></b>, <b><a href="/details.php?id=15829&hit=1&filelist=1">1</a></b>, <b
><a href="/details.php?id=15829&hit=1&=1"><font>30</font></a></
b>, <b><a href="/details.php?id=15829&hit=1&todlers=1">1</a></b>,
When I print filtered I get the above result. It's slightly longer, but you get the idea. Five lines from the bottom, marked with **, is the result that should cause a positive match but doesn't.
I believe you want something more like the following
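The likely cause is that `filtered` is a list of tag objects, so `word in filtered` asks whether some element of the list equals the string, not whether the string occurs inside an element's text. A minimal sketch of a substring check is below. It assumes Python 3 and the `bs4` package rather than the older BeautifulSoup 3 API used in the question, and the helper names `bold_texts`, `words_found`, and `report` are made up for illustration.

```python
def bold_texts(html):
    """Return the text content of every <b> tag in the page."""
    # Assumption: the bs4 package is installed. It is imported here so that
    # words_found() below can be used on plain strings without it.
    from bs4 import BeautifulSoup
    soup = BeautifulSoup(html, "html.parser")
    return [b.get_text() for b in soup.find_all("b")]


def words_found(texts, words):
    """Map each word to True if it is a substring of at least one text."""
    # `word in text` on two strings is a substring test, so "Testing"
    # matches "The.Testing.example.T3Z6.L". The match is case-sensitive.
    return {word: any(word in text for text in texts) for word in words}


def report(html, words):
    """Print a found / not found line for each word."""
    for word, found in words_found(bold_texts(html), words).items():
        print("%s %s." % (word, "found" if found else "not found"))
```

Calling `report(html, ["Testing"])` with the page from the question should then print a "found" line, because the check runs against the text of each tag instead of the tag objects themselves. If the match should ignore case, lowercasing both `word` and `text` before comparing is one option.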
The BeautifulSoup documentation is available here.