
The Archive Base


Editorial Team
Asked: May 27, 2026 (2026-05-27T00:06:52+00:00)

I have a Scrapy project and I am trying to save the output items


I have a Scrapy project and I am trying to save the output items as instances of Django models (I am not using DjangoItem).

I am importing Django settings as specified here.

def setup_django_env(path):
    import imp, os
    from django.core.management import setup_environ

    # Locate and load the Django project's settings module from `path`.
    f, filename, desc = imp.find_module('settings', [path])
    project = imp.load_module('settings', f, filename, desc)

    # Configure Django to use the loaded settings.
    setup_environ(project)

setup_django_env(PATH_TO_DJANGO_PROJECT)

In my Scrapy project I have a pipeline class that processes each item and saves it to the DB:

from my_django_project.apps.my_books.models import Book, Category, Image

class DjangoPipeline(object):

    def process_item(self, item, spider):
        category = Category.objects.get(name='Horror')
        book = Book(name='something', category=category)
        book.save()
        image = Image(name='something', book=book)
        image.save()
        return item

However, the first item always fails with an error (see below), while the rest of the items are saved without problems. For example, if I have 7 items to save, the first one raises the error and the other 6 get saved.

Traceback (most recent call last):
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/scrapy/middleware.py", line 54, in _process_chain
    return process_chain(self.methods[methodname], obj, *args)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/scrapy/utils/defer.py", line 65, in process_chain
    d.callback(input)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/twisted/internet/defer.py", line 243, in callback
    self._startRunCallbacks(result)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/twisted/internet/defer.py", line 312, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/users/ale/djcode/books/lib/scraper/scraper/djangopipeline.py", line 34, in process_item
    selected_category = Category.objects.get(name='Horror')
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/manager.py", line 132, in get
    return self.get_query_set().get(*args, **kwargs)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/query.py", line 333, in get
    clone = self.filter(*args, **kwargs)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/query.py", line 550, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/query.py", line 568, in _filter_or_exclude
    clone.query.add_q(Q(*args, **kwargs))
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/sql/query.py", line 1131, in add_q
    can_reuse=used_aliases)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/sql/query.py", line 1026, in add_filter
    negate=negate, process_extras=process_extras)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/sql/query.py", line 1182, in setup_joins
    field, model, direct, m2m = opts.get_field_by_name(name)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/options.py", line 291, in get_field_by_name
    cache = self.init_name_map()
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/options.py", line 321, in init_name_map
    for f, model in self.get_all_related_m2m_objects_with_model():
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/options.py", line 396, in get_all_related_m2m_objects_with_model
    cache = self._fill_related_many_to_many_cache()
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/options.py", line 410, in _fill_related_many_to_many_cache
    for klass in get_models():
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/loading.py", line 167, in get_models
    self._populate()
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/loading.py", line 61, in _populate
    self.load_app(app_name, True)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/loading.py", line 76, in load_app
    app_module = import_module(app_name)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/utils/importlib.py", line 35, in import_module
    __import__(name)
exceptions.ImportError: No module named my_books

If I do something like this, all 7 items get saved:

from my_django_project.apps.my_books.models import Book, Category, Image

class DjangoPipeline(object):

    def process_item(self, item, spider):
        try:
            category = Category.objects.get(name='something')
        except:
            category = Category.objects.get(name='something')
        book = Book(name='something', category=category)
        try:
            book.save()
        except:
            book.save()
        image = Image(name='something', book=book)
        try:
            image.save()
        except:
            image.save()
        return item

I don’t know what I am doing wrong. Could someone help me, please?

Thanks!


1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 12:06 am (2026-05-27T00:06:53+00:00)

    I had the same problem and I found a solution. At least, it worked for me.

    In my case the problem was in the Django project's settings.py file: I had added my app to the INSTALLED_APPS tuple by its short name rather than by its FQN (fully qualified name).

    In your example, you may have added my_books to INSTALLED_APPS instead of my_django_project.apps.my_books. The short name is not importable from the Scrapy process, which matches the "ImportError: No module named my_books" at the end of your traceback.
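
As a rough illustration of the fix, here is a minimal sketch of what the settings entry could look like, plus a small helper for checking which entries cannot be imported from the current environment. The helper name find_unimportable_apps is made up for this example and is not part of Django or Scrapy. It also assumes every INSTALLED_APPS entry is a plain module path, as in the Django version shown in the traceback.

```python
import importlib

# Hypothetical settings.py fragment: the dotted path has to be importable
# from wherever Scrapy runs, not only from inside the Django project.
INSTALLED_APPS = (
    "my_django_project.apps.my_books",  # fully qualified name
    # "my_books",  # short name: may only resolve inside the project itself
)


def find_unimportable_apps(app_names):
    """Return (name, error message) pairs for entries that fail to import.

    Illustrative helper only; it assumes each entry is a module path.
    """
    failures = []
    for name in app_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            failures.append((name, str(exc)))
    return failures
```

Calling find_unimportable_apps(INSTALLED_APPS) from the directory where the crawl is started should return an empty list once every entry uses a path that resolves there. Any entry it reports is a likely candidate for the ImportError above.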
