How do I find a specific div by navigating the attributes of a soup, i.e. something like soup.html.body.div? I don’t see how to get the specific div with id='idname' this way.
I can do soup.findAll(id='idname')[0] to get the specific tag, but as I understand it this is searching the whole soup.
I imagine getting the div by attribute on the soup would be faster since you are not using findAll()?
Firebug reports the location as being html.body.div[2].form.table[2].tbody.tr[3]... but doing soup.html.body.div[2] raises a KeyError.
Update:
Say you want to grab the "I’m Feeling Lucky" button from http://www.google.com. Firebug reports its location as:
/html/body/center/span/center/div[2]/form/div[2]/div[3]/center/input[2]
Is there a way to reach this without using findAll?
The path you get from Firebug is an XPath expression. It’s best to use a parser that lets you use XPath directly. I like using
lxml with its etree interface:
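A minimal sketch of that approach, assuming lxml is installed. The markup below is invented for illustration (it is not Google's actual page), and the names HTML, by_path and by_id are hypothetical. It evaluates a Firebug-style positional path and, as an alternative, a lookup by id:

```python
from lxml import etree

# Hypothetical markup standing in for a fetched page; the structure is
# made up for this example and is not the real google.com source.
HTML = """
<html>
  <body>
    <div id="header">header</div>
    <div id="idname">
      <form>
        <input type="submit" name="search" value="Search">
        <input type="submit" name="lucky" value="I'm Feeling Lucky">
      </form>
    </div>
  </body>
</html>
"""

# etree.HTML parses HTML (even if not well-formed) and returns the root element.
root = etree.HTML(HTML)

# XPath positions are 1-based: div[2] is the second <div> child of <body>,
# input[2] is the second <input> child of <form>.
by_path = root.xpath("/html/body/div[2]/form/input[2]")

# Selecting by attribute does not depend on the element's position in the tree.
by_id = root.xpath("//div[@id='idname']")

lucky_button = by_path[0]
target_div = by_id[0]
print(lucky_button.get("value"))
print(target_div.get("id"))
```

Note that xpath() returns a list, so an empty result means the path did not match. A path copied from a browser tool describes the browser's parsed DOM (which may include elements such as tbody that are absent from the raw source), so it may need adjusting before it matches what lxml parses.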