Can someone check whether the code below is correct?
The code is found at
http://readthedocs.org/docs/scrapy/en/0.14/topics/exporters.html
I believe it is incorrect because:
- The class keeps track of multiple simultaneously open files, one per spider.
- However, the exporter (which depends on the file) is stored in a single attribute, so it is overwritten each time a new spider is opened.
Thanks for any assistance.
class XmlExportPipeline(object):

    def __init__(self):
        dispatcher.connect(self.spider_opened, signals.spider_opened)
        dispatcher.connect(self.spider_closed, signals.spider_closed)
        self.files = {}

    def spider_opened(self, spider):
        file = open('%s_products.xml' % spider.name, 'w+b')
        self.files[spider] = file
        self.exporter = XmlItemExporter(file)
        self.exporter.start_exporting()

    def spider_closed(self, spider):
        self.exporter.finish_exporting()
        file = self.files.pop(spider)
        file.close()

    def process_item(self, item, spider):
        self.exporter.export_item(item)
        return item
I think this question should be asked in the scrapy-users group.
AFAIK, since v0.14 Scrapy doesn't support multiple spiders in one process (related discussion), so this code will work fine. The obvious fix for multiple spiders is to make exporters a dict with spider keys:
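
A minimal sketch of that fix, written so it runs without Scrapy installed. The `StubExporter` class, the `exporter_class` and `out_dir` parameters are assumptions added for illustration and are not part of the original pipeline; in a real project you would pass `XmlItemExporter` instead of the stub and keep the two `dispatcher.connect(...)` calls from the question in `__init__`.

```python
import os


class StubExporter(object):
    """Hypothetical stand-in exposing the three exporter methods the pipeline calls."""

    def __init__(self, file):
        self.file = file

    def start_exporting(self):
        self.file.write(b'<items>')

    def export_item(self, item):
        self.file.write(('<item>%s</item>' % item).encode('utf-8'))

    def finish_exporting(self):
        self.file.write(b'</items>')


class XmlExportPipeline(object):

    def __init__(self, exporter_class=StubExporter, out_dir='.'):
        # exporter_class and out_dir are illustrative parameters (assumptions).
        # Signal wiring from the original is omitted here for brevity.
        self.exporter_class = exporter_class
        self.out_dir = out_dir
        self.files = {}
        self.exporters = {}  # one exporter per spider, like self.files

    def spider_opened(self, spider):
        path = os.path.join(self.out_dir, '%s_products.xml' % spider.name)
        file = open(path, 'w+b')
        self.files[spider] = file
        self.exporters[spider] = self.exporter_class(file)
        self.exporters[spider].start_exporting()

    def spider_closed(self, spider):
        # Finish the exporter that belongs to this spider, then close its file.
        self.exporters.pop(spider).finish_exporting()
        self.files.pop(spider).close()

    def process_item(self, item, spider):
        # Route the item to the exporter of the spider that produced it.
        self.exporters[spider].export_item(item)
        return item
```

With this change each spider writes only to its own file, and closing one spider no longer finishes another spider's exporter.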