I read about Voltdb’s command log. The command log records the transaction invocations instead of each row change as in a write-ahead log. By recording only the invocation, the command logs are kept to a bare minimum, limiting the impact the disk I/O will have on performance.
Can anyone explain the database theory behind why Voltdb uses a command log and why the standard SQL databases such as Postgres, MySQL, SQLServer, Oracle use a write-ahead log?
I think it is better to rephrase:
Let’s do an experiment and imagine you are going to write your own storage/database implementation. Undoubtedly you are advanced enough to abstract a file system and use block storage along with some additional optimizations.
Some basic terminology:
So your database may look like the following:
Next step is to execute some command:
Please note several important aspects:
Some intermediate states can be skipped, because it is enough to have a chain of commands instead.
Finally, you need to guarantee data integrity.
There are Pros and Cons for both approaches. Write-Ahead log contains all changed data, Command log will require addition processing, but fast and lightweight.
VoltDB: Command Logging and Recovery
Additional notes
SQLite: Write-Ahead Logging
PostgreSQL: Write-Ahead Logging (WAL)
Conclusion
Command Logging:
Write Ahead Logging is a technique to provide atomicity. Better Command Logging performance should also improve transaction processing. Databases on 1 Foot
Confirmation
VoltDB Blog: Intro to VoltDB Command Logging
It is important to remember VoltDB is distributed by its nature. So transactions are a little bit tricky to handle and performance impact is noticeable.
VoltDB Blog: VoltDB’s New Command Logging Feature