I’m using python to write a crawler, since I need to parse html so


Asked: June 7, 2026 (2026-06-07T09:04:25+00:00)


I’m using Python to write a crawler. Since I need to parse HTML, I import lxml, but it produces a weird error:

<type 'dict'>
{'xpath': '//ul[@id="i-detail"]/li[1]', 'name': u'\u6807\u9898'}

<type 'dict'>
{'xpath': '//ul[@id="i-detail"]/li[1]', 'name': u'\u6807\u9898'}

<type 'dict'>   
{'xpath': '//ul[@id="i-detail"]/li[1]', 'name': u'\u6807\u9898'}
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 522, in __bootstrap_inner
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File "fetcher.py", line 78, in run
    self.extractContent(html)
  File "fetcher.py", line 151, in extractContent
    m = tree.xpath(c['xpath'])
AttributeError: 'NoneType' object has no attribute 'xpath'

<type 'dict'>
{'xpath': '//ul[@id="i-detail"]/li[1]', 'name': u'\u6807\u9898'}

Here’s a piece of my code:

for c in self.contents:
  print type(c)
  print c
  m = tree.xpath(c['xpath'])

Please help me with these two questions:

Why is the type dict when the error says NoneType?
I’m trying to match something in the “tree”, but it doesn’t work. The website is encoded in GBK; could the encoding cause this kind of problem?


1 Answer

Editorial Team · Answer 1 · 2026-06-07T09:04:26+00:00

You are getting an AttributeError, which means that tree has no xpath attribute because tree itself has become None. It does not mean that c has no 'xpath' key; that would be a KeyError instead.
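The difference between the two exceptions can be reproduced without lxml. The following is a minimal Python 3 sketch (the question uses Python 2.6); `FakeTree` and `describe_failure` are hypothetical names invented for illustration, and `c` mirrors the dict printed in the question.

```python
class FakeTree:
    """Hypothetical stand-in for a parsed tree; not part of lxml."""

    def xpath(self, expression):
        # Pretend the expression matched nothing.
        return []


def describe_failure(tree, c, key="xpath"):
    """Run the same call as in the question and report which exception occurs."""
    try:
        tree.xpath(c[key])
    except AttributeError as exc:
        # Raised when tree is None: None has no attribute 'xpath'.
        return "AttributeError: %s" % exc
    except KeyError as exc:
        # Raised when the dict c lacks the requested key.
        return "KeyError: %s" % exc
    return "ok"


c = {"xpath": '//ul[@id="i-detail"]/li[1]', "name": "\u6807\u9898"}

print(describe_failure(None, c))                       # tree is None
print(describe_failure(FakeTree(), c, key="missing"))  # c lacks the key
print(describe_failure(FakeTree(), c))                 # both are fine
```

The first call reports an AttributeError even though `c` is a perfectly good dict, which matches the traceback in the question.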

Clearly we are missing some code here: the part where tree is set to None.
You are also not printing the result of your tree.xpath() calls; nothing in the code you shared prints m. The tree.xpath() calls could be working fine for all we know.

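Reading between the lines and speculating a little: tree is probably being reassigned somewhere in the code you have not shown, for example to the result of a call that returned None. Note that lxml's own xpath() returns an empty list, not None, when an expression matches nothing, so the None most likely comes from some other assignment. The next time through the loop, you have None instead of an element, and the xpath() call fails with an AttributeError.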
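One way to make this kind of bug surface earlier is to check tree before the loop and to keep each XPath result in its own variable, so tree is never rebound. This is only a sketch in Python 3 under assumptions: `extract_content` here is a hypothetical free function, not the method from fetcher.py, and `StubTree` is a stand-in so the example runs without lxml.

```python
class StubTree:
    """Hypothetical stand-in for a parsed document; real code would pass an lxml tree."""

    def xpath(self, expression):
        # Pretend nothing matched.
        return []


def extract_content(tree, contents):
    """Collect XPath matches per entry without ever rebinding tree."""
    if tree is None:
        # Fail early with a clear message instead of an AttributeError in the loop.
        raise ValueError("tree is None; check the code that parses the HTML")
    results = {}
    for c in contents:
        matches = tree.xpath(c["xpath"])  # separate name; tree stays untouched
        results[c["name"]] = matches
    return results


contents = [{"xpath": '//ul[@id="i-detail"]/li[1]', "name": "\u6807\u9898"}]
print(extract_content(StubTree(), contents))
```

Printing or logging `matches` inside the loop would also show whether the expressions match anything, which the code in the question never reveals.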
