I’m talking about a library that would allow me to log events from different machines and would align these events on a “global” time axis with sufficiently high precision.
Actually, I’m asking because I wrote such a thing myself during a cluster computing project and found it terrifically useful, so I was surprised that I couldn’t find anything similar.
So the question is whether something like this already exists (in which case I’d better contribute to it) or nothing exists (in which case I’d better write an open-source analogue of my solution).
Here are the features that I’d expect from such a library:
- Independence from the clock offsets between different machines
- Timing precision on the order of at least milliseconds, preferably microseconds
- Scalability to thousands of concurrent logging processes, with at least several megabytes of aggregated logs per second
- Soft real-time operation (i.e. I don’t want to collect 200 big logs from 200 machines and then compute clock offsets and merge them; I want to see what happens “live”, perhaps with a small lag of around 10 s)
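To make the first and last requirements concrete, here is a rough Python sketch of what I mean. It is only an illustration under assumptions, not my actual implementation: the offset is estimated with the standard NTP-style four-timestamp exchange, the collector’s clock is taken as the “global” axis, and the names `estimate_offset`, `best_offset` and `LiveMerger` are made up for this example.

```python
import heapq


def estimate_offset(t0, t1, t2, t3):
    """NTP-style estimate from one request/reply exchange.

    t0: collector clock when the request was sent
    t1: remote clock when the request arrived
    t2: remote clock when the reply was sent
    t3: collector clock when the reply arrived
    Returns (offset, delay), where offset = remote clock - collector clock.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


def best_offset(samples):
    """Use the exchange with the smallest round-trip delay (least jitter)."""
    estimates = [estimate_offset(*s) for s in samples]
    return min(estimates, key=lambda od: od[1])[0]


class LiveMerger:
    """Buffers events and releases them in global-time order after a fixed lag."""

    def __init__(self, lag):
        self.lag = lag          # seconds to wait for late events
        self.offsets = {}       # machine -> (remote clock - collector clock)
        self._heap = []
        self._seq = 0           # tie-breaker for equal timestamps

    def set_offset(self, machine, offset):
        self.offsets[machine] = offset

    def add(self, machine, local_ts, message):
        # Map the machine's local timestamp onto the collector's time axis.
        global_ts = local_ts - self.offsets.get(machine, 0.0)
        heapq.heappush(self._heap, (global_ts, self._seq, machine, message))
        self._seq += 1

    def drain(self, now):
        """Return all events older than now - lag, sorted by global time."""
        out = []
        while self._heap and self._heap[0][0] <= now - self.lag:
            ts, _, machine, message = heapq.heappop(self._heap)
            out.append((ts, machine, message))
        return out
```

The collector would call `drain(now)` periodically; anything arriving more than `lag` seconds late would come out of order, so the lag trades freshness against ordering guarantees.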
Facebook’s contribution in this area is called ‘Scribe’.
Excerpt:
…
The API is Thrift-based, so you get good platform coverage; if you’re looking for simple Java integration, you may want to have a look at Digg’s log4j appender for Scribe.