This is my spider code:
class DmozSpider(BaseSpider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "file:///home/ubuntu/xxx/test.html",
    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        sites = hxs.select("//li")
        items = []
        for site in sites:
            item = DmozItem()
            item['title'] = site.select('a/text()').extract()
            item['link'] = site.select('a/@href').extract()
            item['desc'] = site.select('text()').extract()
            items.append(item)
        return items
Now I want to write the data to a log file, in a format like name: {{name}}, link={{link}}, for testing, as it crawls the site live.
How can I do that?
Here’s the answer, but I assume you just copied the code you already have; otherwise you’d know how to use file IO, or at least be able to research a topic that has been covered a million times on this site alone.
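A minimal sketch of the file IO approach, assuming each item supports dict-style access the way the spider above uses it (item['title'], item['link']). The helper names format_item and log_item and the default path items.log are hypothetical, made up for this illustration; they are not part of Scrapy. Plain dicts stand in for DmozItem so the snippet runs on its own.

```python
def format_item(item):
    """Build one log line in the 'name: ..., link=...' shape from the question."""
    # extract() returns a list of strings, so join the pieces into one value
    title = " ".join(s.strip() for s in item["title"])
    link = " ".join(s.strip() for s in item["link"])
    return "name: %s, link=%s" % (title, link)


def log_item(item, log_path="items.log"):
    """Append one line for this item to the log file (hypothetical helper)."""
    # Append mode keeps earlier lines; closing the file after each write
    # means the line is on disk while the crawl is still running
    with open(log_path, "a") as log_file:
        log_file.write(format_item(item) + "\n")


if __name__ == "__main__":
    # Stand-in for a DmozItem filled in by the spider's parse() method
    sample = {"title": ["Example title"], "link": ["http://example.com/"]}
    log_item(sample)
```

In the spider, the call would go inside the for loop in parse(), for example log_item(item) just before items.append(item), so each line is written as the item is scraped rather than after the whole page is done.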