I would like to know how I can start a crawler based on Scrapy. I installed the tool via apt-get install and tried to run an example:
/usr/share/doc/scrapy/examples/googledir/googledir$ scrapy list
directory.google.com
/usr/share/doc/scrapy/examples/googledir/googledir$ scrapy crawl
I modified the code in spiders/google_directory.py, but it does not seem to be executed, because none of the print statements I inserted produce any output. I read the documentation but found nothing related to this. Do you have any ideas?
Also, if you think that for crawling a website I should use other tools, please let me know. I’m not experienced with Python tools and Python is a must.
Thanks!
You missed the spider name in the crawl command. Use the name that scrapy list printed:
scrapy crawl directory.google.com
Also, I suggest you copy the example project to your home directory instead of working in the /usr/share/doc/scrapy/examples/ directory, so you can modify it and play with it.
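The copy-and-run steps might look like the sketch below. It assumes the example lives under /usr/share/doc/scrapy/examples/googledir (inferred from the path in the question), that the spider is named directory.google.com as scrapy list reported, and that ~/googledir is a free location. The SRC and DEST variable names are only illustrative.

```shell
# Assumed paths: SRC is taken from the prompt shown in the question,
# DEST is a hypothetical working copy in the home directory.
SRC=/usr/share/doc/scrapy/examples/googledir
DEST="$HOME/googledir"

# Copy the example project, but only if it exists and DEST is not taken yet.
if [ -d "$SRC" ] && [ ! -e "$DEST" ]; then
    cp -r "$SRC" "$DEST"
fi

# Run the spider by name from inside the copied project.
# Skipped when the copy is missing or scrapy is not on the PATH.
if [ -d "$DEST" ] && command -v scrapy >/dev/null 2>&1; then
    cd "$DEST" && scrapy crawl directory.google.com
fi
```

Files under /usr/share/doc are normally owned by root, so working on a copy in your home directory lets you edit the spider as your own user.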