I am trying to parse an HTML document, but bs4 fails to parse the attributes of one specific tag:
<select class="inputNormal" id="TipoImmobileDaNonImportare" name="TipoImmobileDaNonImportare" style="width:100%">
<option value=""></option>
<option value="unità immobiliare urbana">unità immobiliare urbana</option>
<option value="particella terreni">particella terreni</option>
</select>
When I print the tags, I get this error:
`AttributeError: 'tuple' object has no attribute 'items'`
The tag and attributes that get printed are: `select: (u'style', u'class', u'name')`
instead of (for example): `input: {u'type': u'hidden', u'name': u'Immobile_Note', u'value': u'Ubicazione occupazione', u'id': u'Immobile_Note'}`
UPDATE:
If I try `soup.find_all(attrs={'id': 'somevalue'})`, it fails because it tries to access the attributes of every tag in the tree!
If I try:
from bs4 import BeautifulSoup
s = BeautifulSoup( """<select class="inputNormal" id="TipoImmobileDaNonImportare" name="TipoImmobileDaNonImportare" style="width:100%">
<option value=""></option>
<option value="unità immobiliare urbana">unità immobiliare urbana</option>
<option value="particella terreni">particella terreni</option>
</select>""")
the parser detects the attributes correctly:
select: {'id': 'TipoImmobileDaNonImportare', 'style': 'width:100%', 'class': ['inputNormal'], 'name': 'TipoImmobileDaNonImportare'}
I also tried parsing the document with the lxml and html5lib parsers, but the result is the same.
Thanks for any replies.
EDIT:
Thanks to Amanda, but the error was in my own code: I was storing a tuple object in `tag.attrs`, because this code is being ported from bs3 to bs4!
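For anyone hitting the same error while porting: Beautiful Soup 3 exposed `tag.attrs` as a list of `(name, value)` pairs, while bs4 expects a dictionary. A minimal sketch of the fix, assuming the legacy code holds the attributes as such pairs (`legacy_attrs` is a hypothetical stand-in for that data, not the actual code from my project):

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<input type="hidden" name="Immobile_Note">', "html.parser")
tag = soup.find("input")

# Hypothetical BS3-style attribute data: a list of (name, value) pairs.
legacy_attrs = [("type", "hidden"), ("name", "Immobile_Note"), ("id", "Immobile_Note")]

# bs4 expects tag.attrs to be a dict, so convert before assigning.
tag.attrs = dict(legacy_attrs)

# Attribute searches work once attrs is a dict.
found = soup.find_all(attrs={"id": "Immobile_Note"})
print(found)
```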
Thanks.
I’m not entirely sure what you’re trying to access with Beautiful Soup here, but if you want to get at the attributes for the select or the options, you can do something like:
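A minimal sketch of the setup, assuming the markup from the question and Python's built-in `html.parser` (the names `html`, `select`, and `options` are only illustrative):

```python
from bs4 import BeautifulSoup

html = """<select class="inputNormal" id="TipoImmobileDaNonImportare" name="TipoImmobileDaNonImportare" style="width:100%">
<option value=""></option>
<option value="unità immobiliare urbana">unità immobiliare urbana</option>
<option value="particella terreni">particella terreni</option>
</select>"""

# Parse the fragment with the parser bundled with Python.
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag; find_all() returns every match.
select = soup.find("select")
options = select.find_all("option")
print(select.name, len(options))
```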
You can show the attributes of the first “select” with:
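For instance, a sketch assuming a shortened version of the question's markup and the built-in `html.parser`:

```python
from bs4 import BeautifulSoup

html = """<select class="inputNormal" id="TipoImmobileDaNonImportare" name="TipoImmobileDaNonImportare" style="width:100%">
<option value=""></option>
<option value="particella terreni">particella terreni</option>
</select>"""

soup = BeautifulSoup(html, "html.parser")

# .attrs is a dict of attribute names to values; "class" is
# multi-valued, so bs4 stores it as a list.
select_attrs = soup.find("select").attrs
print(select_attrs)
```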
Or show the attributes of all the options with:
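Something along these lines should work, again assuming the question's markup and `html.parser`:

```python
from bs4 import BeautifulSoup

html = """<select id="TipoImmobileDaNonImportare">
<option value=""></option>
<option value="unità immobiliare urbana">unità immobiliare urbana</option>
<option value="particella terreni">particella terreni</option>
</select>"""

soup = BeautifulSoup(html, "html.parser")

# Collect the attribute dict of every <option> tag.
option_attrs = [option.attrs for option in soup.find_all("option")]
for attrs in option_attrs:
    print(attrs)
```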
Or, if you’re looking for the names of available items, use:
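One way to do that, sketched under the same assumptions (the question's markup, `html.parser`), is to read the text of each option with `get_text()`:

```python
from bs4 import BeautifulSoup

html = """<select id="TipoImmobileDaNonImportare">
<option value=""></option>
<option value="unità immobiliare urbana">unità immobiliare urbana</option>
<option value="particella terreni">particella terreni</option>
</select>"""

soup = BeautifulSoup(html, "html.parser")

# get_text() returns the displayed text; the blank option yields "".
names = [option.get_text() for option in soup.find_all("option")]
print(names)
```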
Or, if you want the option value rather than the displayed text, use:
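A sketch of reading the `value` attribute instead, assuming the question's markup and `html.parser`:

```python
from bs4 import BeautifulSoup

html = """<select id="TipoImmobileDaNonImportare">
<option value=""></option>
<option value="unità immobiliare urbana">unità immobiliare urbana</option>
<option value="particella terreni">particella terreni</option>
</select>"""

soup = BeautifulSoup(html, "html.parser")

# Tag.get() behaves like dict.get(): it returns None rather than
# raising if an <option> has no "value" attribute.
values = [option.get("value") for option in soup.find_all("option")]
print(values)
```

Using `option["value"]` would also work here, but it raises `KeyError` for any option that lacks the attribute.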
If that doesn’t help, maybe you could give an example of the output you’re expecting.