Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8852481
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 14, 20262026-06-14T13:24:44+00:00 2026-06-14T13:24:44+00:00

I am working on scrapy, i have created two spider files with two different

  • 0

I am working on scrapy, i have created two spider files with two different urls in a single scrapy project.

And the two spiders scraping perfectly when runned individually. Actually the problem is each url has different items to fetch so declared all items in items.py file.
Here after scraping i am storing the data in to csv file created dynamically with the name of spider.

So for example when i need to run spider1, i need to declare separate process_item method, as items are different for both spiders and when i need to run second spider i need to write another process_item by commenting other method.
Whether there is any way in scrapy to use two process_item method? and below is my pipeline.py code

pipeline.py

from w3c_browser.items import WCBrowserItem
import csv
from csv import DictWriter
from cStringIO import StringIO
from datetime import datetime
class W3CBrowserPipeline(object):
    def __init__(self):
        dispatcher.connect(self.spider_opened, signal=signals.spider_opened)
        dispatcher.connect(self.spider_closed, signal=signals.spider_closed)
        self.brandCategoryCsv = csv.writer(open('wcbbrowser.csv', 'wb'))

    def spider_opened(self, spider):
        spider.started_on = datetime.now()
        if spider.name == 'browser_statistics':
            log.msg("opened spider  %s at time %s" % (spider.name,datetime.now().strftime('%H-%M-%S')))
            self.brandCategoryCsv = csv.writer(open("csv/%s-%s.csv"% (spider.name,datetime.now().strftime('%d%m%y')), "wb"),
                       delimiter=',', quoting=csv.QUOTE_MINIMAL)
        elif spider.name == 'browser_os':
            log.msg("opened spider  %s at time %s" % (spider.name,datetime.now().strftime('%H-%M-%S')))
            self.brandCategoryCsv = csv.writer(open("csv/%s-%s.csv"% (spider.name,datetime.now().strftime('%d%m%y')), "wb"),
                       delimiter=',', quoting=csv.QUOTE_MINIMAL)

    def process_item(self, item, spider):
        self.brandCategoryCsv.writerow([item['year'],
                                        item['internet_explorer'],
                                        item['firefox'],
                                        item['chrome'],
                                        item['safari'],
                                        item['opera'],


        ])
        return item
# For Browser Os
#    def process_item(self, item, spider):
#        self.brandCategoryCsv.writerow([item['year'],
#                                        item['vista'],
#                                        item['nt'],
#                                        item['winxp'],
#                                        item['linux'],
#                                        item['mac'],
#                                        item['mobile'],
#                                        
#
#        ])
#        return item

    def spider_closed(self, spider):
        log.msg("closed spider %s at %s" % (spider.name,datetime.now().strftime('%H-%M-%S')))
        work_time = datetime.now() - spider.started_on
        print str(work_time),"Total Time taken by the spider to run>>>>>>>>>>>"

Here in the above code as you observe, when i run the spider with the name browser_statistics it will create a csv file with browser_statistics-date format and writes the data from item in to csv file

But when i want to run the second spider with name browser_os , process_item method doesn’t work because both spiders are having different items to fetch

Can anyone please let me know

Is there anyway to run more than one spiders with same process_item with different items ?

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-14T13:24:45+00:00Added an answer on June 14, 2026 at 1:24 pm

    best option is you can use IF ELSE based of spider name in process_item

    like

    def process_item(self, item, spider):
        if 'spider1' in spider.name:
            #TODO write CSV for spider1
        else:
            #Should be a spider2
            #TODO write CSV for spider2
        return item
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I am working on scrapy, i am scheduling a spider i had wrote with
I have been trying to get a simple spider to run with scrapy, but
I am working with the web scraping framework Scrapy and I am wondering how
i have multiple spiders in one project , problem is right now i am
Hi I am working on scraping the XML file. For HTML I have used
I am working on a little project of mine and have built a UDP
Hi i am working on scrapy Below is my code class examplespider(CrawlSpider): name =
I am working on scrapy , i am trying to gather some data from
Working with Reporting Services 2008 r2. So here's my issue: We have 5 reports
Working on the problems on Project Euler to try to learn Clojure. I'm on

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.