I’ve been working on a server and I’m starting to implement logging. However, I’m not sure whether I should log to the db or just to a plain-text file.
I’m planning on logging some basic information for every request (the request type, the IP address it came from, and session-tracking data). Some requests will carry extended information (details of what exactly was requested), and if there are any errors I’ll log those, too.
On the one hand, putting the logs into the db means I could run queries on the logged data. On the other hand, I’m not sure whether this would put unnecessary strain on the db. Of course, I could also use both the db and a log file. What are people’s thoughts on proper logging?
(If it makes a difference, I’m using mod_python on an Apache server with a MySQL db. So I’d either be using the logging library or just creating some logging tables in the db.)
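To make the table option concrete, this is roughly the kind of request-log table I have in mind (sketched with sqlite3 standing in for MySQL; the table and column names are just placeholders):

```python
import sqlite3

# Sketch of a per-request log table; sqlite3 stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE request_log (
        id          INTEGER PRIMARY KEY,
        logged_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        method      TEXT NOT NULL,   -- type of request
        ip_address  TEXT NOT NULL,
        session_id  TEXT,            -- session tracking
        details     TEXT,            -- extended info, if any
        error       TEXT             -- error message, if any
    )
""")
conn.execute(
    "INSERT INTO request_log (method, ip_address, session_id) VALUES (?, ?, ?)",
    ("GET", "203.0.113.7", "abc123"),
)
rows = conn.execute("SELECT method, ip_address FROM request_log").fetchall()
print(rows)  # → [('GET', '203.0.113.7')]
```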
First, use a logging library that lets you make this decision dynamically; in the Java world that would be SLF4J/Logback, and in your case it’s Python’s standard logging module. Then you can tweak a configuration file and route some or all of your log messages to any of several different destinations.
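With Python’s logging module, that routing decision lives in a config dict (which could equally be loaded from a file and changed without touching application code). A minimal sketch, with placeholder handler and logger names:

```python
import logging
import logging.config
import os
import tempfile

# One config routes the same messages to two destinations: console and file.
logfile = os.path.join(tempfile.mkdtemp(), "app.log")

logging.config.dictConfig({
    "version": 1,
    "formatters": {
        "plain": {"format": "%(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "plain"},
        "file": {
            "class": "logging.FileHandler",
            "filename": logfile,
            "formatter": "plain",
        },
    },
    "root": {"level": "INFO", "handlers": ["console", "file"]},
})

log = logging.getLogger("requests")
log.info("GET /index from 203.0.113.7")

with open(logfile) as f:
    print(f.read().strip())  # → INFO requests: GET /index from 203.0.113.7
```

Adding or removing a destination later is then a config change, not a code change.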
Be very careful before logging to your application database: you can easily overwhelm it if you’re logging a lot of stuff and volume starts to get high. And if your application is running close to full capacity or in a failure mode, the log messages may be inaccessible just when you need them, and you’ll be flying blind. Probably the only messages that should go to your application database are high-level, application-oriented events (a type of application data).
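One way to keep only those high-level events in the database is a custom logging handler with its own level threshold: per-request noise never reaches the db. A sketch, with sqlite3 standing in for MySQL and made-up table/logger names:

```python
import logging
import sqlite3

class DatabaseHandler(logging.Handler):
    """Write log records to a db table (sqlite3 stands in for MySQL)."""

    def __init__(self, conn):
        super().__init__()
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS app_events (level TEXT, message TEXT)"
        )

    def emit(self, record):
        try:
            self.conn.execute(
                "INSERT INTO app_events (level, message) VALUES (?, ?)",
                (record.levelname, record.getMessage()),
            )
            self.conn.commit()
        except Exception:
            self.handleError(record)

conn = sqlite3.connect(":memory:")
db_handler = DatabaseHandler(conn)
db_handler.setLevel(logging.WARNING)  # only high-level events reach the db

log = logging.getLogger("events")
log.setLevel(logging.DEBUG)
log.addHandler(db_handler)

log.debug("per-request noise")           # filtered out by the handler
log.warning("payment gateway degraded")  # stored

stored = conn.execute("SELECT level, message FROM app_events").fetchall()
print(stored)  # → [('WARNING', 'payment gateway degraded')]
```

The same logger can still send the debug-level noise to a file handler, so nothing is lost; the database just sees less of it.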
It’s much better to “log to the file system” (which, for a large production environment, can mean logging to a multicast address read by redundant log-aggregation servers).
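In Python terms, shipping records off-box can be as simple as attaching a `logging.handlers.DatagramHandler` (or `SysLogHandler`) pointed at your aggregation address. A sketch where a local UDP socket stands in for the aggregation server:

```python
import logging
import logging.handlers
import pickle
import socket

# Stand-in for a log-aggregation server: a local UDP socket. In production
# this address would be your aggregator (or multicast group) instead.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
host, port = receiver.getsockname()

log = logging.getLogger("shipper")
log.setLevel(logging.INFO)
log.addHandler(logging.handlers.DatagramHandler(host, port))

log.info("GET /index from 203.0.113.7")

# DatagramHandler sends the pickled record with a 4-byte length prefix.
data, _ = receiver.recvfrom(65536)
record = logging.makeLogRecord(pickle.loads(data[4:]))
print(record.getMessage())  # → GET /index from 203.0.113.7
```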
Log files can then be read into dedicated analytics databases, where you can use, e.g., Hadoop to run map/reduce analyses of the log data.
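Hadoop aside, the map/reduce idea is simple enough to show in miniature: parse each log line (map), then aggregate (reduce). A sketch over a made-up log format:

```python
from collections import Counter

# Made-up lines in the format the request logger might produce.
log_lines = [
    "GET /index 203.0.113.7",
    "POST /login 203.0.113.7",
    "GET /index 198.51.100.2",
]

# Map: extract the IP from each line. Reduce: count requests per IP.
hits_per_ip = Counter(line.split()[-1] for line in log_lines)
print(hits_per_ip.most_common())  # → [('203.0.113.7', 2), ('198.51.100.2', 1)]
```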