I’m having a really annoying problem. The answer is probably very simple, yet I can’t put 2 and 2 together…
I have an example of a string that’ll look something like this:
<a href="javascript:void(0);" onclick="viewsite(38903);" class="followbutton">Visit</a>
The number 38903 will be different every time I load the page, so I need a way to parse it out on each load. I’ve gotten far enough to grab and store the piece of HTML above, but I can’t grab just the number.
Again, probably a really easy thing to do, just can’t figure it out. Thanks in advance!
If you’re using BeautifulSoup, it is dead simple to get just the `onclick` string, which will make this easier. But here’s a really crude way to do it: use `re.sub` to replace every match of `\D` with an empty string. `\D` matches all non-digits, so this will remove everything in the string that isn’t a number. Then take a slice to get rid of the “0” from `javascript:void(0)`.

Other options: use `re.findall` to grab every series of digits and take the second one (`re.search` alone returns only the first match). Or use `re.search` to match a series of digits after a substring, where the substring is `<a href="javascript:void(0);" onclick="viewsite(`.

Edit: It sounds like you are using BeautifulSoup. In that case, presumably you have an object which represents the `a` tag. Let’s assume that object is named `a`. Then `a['onclick']` gives you just the `onclick` string, and you can pull the digits out of that.
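For what it’s worth, here is a rough sketch of those options side by side, using only the standard `re` module. The variable names (`html`, `crude`, `second`, `anchored`, `from_onclick`) are made up for illustration. The "second series of digits" option uses `re.findall` rather than `re.search`, and the anchored pattern is shortened to `viewsite(` instead of the full substring. The last part uses a literal string as a stand-in for what `a['onclick']` would return, so it runs without BeautifulSoup installed.

```python
import re

html = ('<a href="javascript:void(0);" onclick="viewsite(38903);" '
        'class="followbutton">Visit</a>')

# Crude: strip every non-digit, then slice off the leading "0"
# that comes from javascript:void(0).
crude = re.sub(r"\D", "", html)[1:]

# Grab every run of digits and take the second one
# (the first run is the "0" from void(0)).
second = re.findall(r"\d+", html)[1]

# Match the digits that directly follow a known substring.
match = re.search(r"viewsite\((\d+)\)", html)
anchored = match.group(1) if match else None

# Assumption: with BeautifulSoup you would write onclick = a["onclick"];
# a plain string stands in for that value here.
onclick = "viewsite(38903);"
from_onclick = re.search(r"\d+", onclick).group()

print(crude, second, anchored, from_onclick)
```

The crude version breaks as soon as any other digit shows up in the tag, so the anchored pattern or the `onclick`-only version is probably the safer choice if the markup might change.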