I’m a newbie in Python looking to build a screen scraper in Scraperwiki but I’m struggling with an error I can’t work out how to fix.
Essentially, I want to parse an XML file but can’t work out how to get my gp_indicators_scrape function to call the getroot() method.
Can anyone fix it, and more importantly, point me towards an explanation so I can avoid the problem in future?
Here’s the scraper: https://scraperwiki.com/scrapers/choiceshu1
The key bits of code:
import lxml.html
import urlparse
from urlparse import urlparse
from lxml.etree import etree
def gp_indicators_scrape(org_URL):
    indicator_xml = etree.parse(org_URL)
    root = lxml.etree.getroot(indicator_XML)
    print root
html = scraperwiki.scrape(combined_URL_for_first_scrape)
print html
root = lxml.html.fromstring(html)
links = root.cssselect("dd a")
And here’s the error when it runs
Line 5 - from lxml.etree import etree
ImportError: cannot import name etree
from lxml.etree import etree
should be
from lxml import etree
Also, just noticed: in lxml.etree.getroot(...) you can drop the lxml. prefix if you use the import above, and normally you call getroot() on the object returned by etree.parse (or similar).
NB: I haven’t looked at the code in the provided link…
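For reference, here is a minimal sketch of how the function might look with both fixes applied. It is an illustration under a few assumptions, not the scraper's actual code: the sample XML is made up, the function returns the root instead of printing it, and the fallback to the standard library's xml.etree.ElementTree is only there so the sketch runs where lxml is not installed. Note also that Python names are case-sensitive, so indicator_xml and indicator_XML in the question are two different variables; the sketch uses one spelling throughout.

```python
import io

try:
    from lxml import etree  # the corrected import from the answer
except ImportError:
    # Assumption: fall back to the standard library, which offers the same
    # parse()/getroot() calls, so the sketch still runs without lxml.
    import xml.etree.ElementTree as etree


def gp_indicators_scrape(source):
    """Parse XML from a filename or file-like object and return its root."""
    indicator_xml = etree.parse(source)  # returns an ElementTree object
    root = indicator_xml.getroot()       # getroot() is a method of that tree
    return root


# Hypothetical sample document standing in for the real indicator feed.
sample = io.BytesIO(b"<indicators><indicator id='1'>42</indicator></indicators>")
root = gp_indicators_scrape(sample)
print(root.tag)
```

The general pattern is that etree.parse gives you a tree object, and getroot() is called on that object rather than on the module.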