I’ve decided to use the Python logging module because the messages generated by Twisted on std error is too long, and I want to INFO level meaningful messages such as those generated by the StatsCollector to be written on a separate log file while maintaining the on screen messages.
from twisted.python import log
import logging
logging.basicConfig(level=logging.INFO, filemode='w', filename='buyerlog.txt')
observer = log.PythonLoggingObserver()
observer.start()
Well, this is fine, I’ve got my messages, but the downside is that I do not know the messages are generated by which spider! This is my log file, with “twisted” being displayed by %(name)s:
INFO:twisted:Log opened.
2 INFO:twisted:Scrapy 0.12.0.2543 started (bot: property)
3 INFO:twisted:scrapy.telnet.TelnetConsole starting on 6023
4 INFO:twisted:scrapy.webservice.WebService starting on 6080
5 INFO:twisted:Spider opened
6 INFO:twisted:Spider opened
7 INFO:twisted:Received SIGINT, shutting down gracefully. Send again to force unclean shutdown
8 INFO:twisted:Closing spider (shutdown)
9 INFO:twisted:Closing spider (shutdown)
10 INFO:twisted:Dumping spider stats:
11 {'downloader/exception_count': 3,
12 'downloader/exception_type_count/scrapy.exceptions.IgnoreRequest': 3,
13 'downloader/request_bytes': 9973,
As compared to the messages generated from twisted on standard error:
2011-12-16 17:34:56+0800 [expats] DEBUG: number of rules: 4
2011-12-16 17:34:56+0800 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
2011-12-16 17:34:56+0800 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2011-12-16 17:34:56+0800 [iproperty] INFO: Spider opened
2011-12-16 17:34:56+0800 [iproperty] DEBUG: Redirecting (301) to <GET http://www.iproperty.com.sg/> from <GET http://iproperty.com.sg>
2011-12-16 17:34:57+0800 [iproperty] DEBUG: Crawled (200) <
I’ve tried %(name)s, %(module)s amongst others but I don’t seem to be able to show the spider name. Does anyone knows the answer?
EDIT:
the problem with using LOG_FILE and LOG_LEVEL in settings is that the lower level messages will not be shown on std error.
You want to use the
ScrapyFileLogObserver.I’m glad you asked this question, I’ve been wanting to do this myself.