I’m working with a moderately sized SQL Server 2008 database (around 120 tables, backups are around 4GB compressed) where all the table primary keys are declared as simple int columns.
At present, primary key values are generated by NHibernate with the increment identity generator, which has worked well thus far, but precludes moving to a multiprocessing environment.
Load on the system is growing, so I’m evaluating the work required to allow the use of multiple servers accessing a common database backend.
Transitioning to the hi-lo generator seems to be the best way forward, but I can’t find a lot of detail about how such a migration would work.
Will NHibernate automatically create rows in the hi-lo table for me, or do I need to script these manually?
If NHibernate does insert rows automatically, does it properly take account of existing key values?
If NHibernate does take care of things automatically, that’s great. If not, are there any tools to help?
Update
NHibernate’s increment identifier generator works entirely in-memory. It’s seeded by selecting the maximum value of used identifiers from the table, but from that point on allocates new values by a simple increment, without reference back to the underlying database table. If any other process adds rows to the table, you end up with primary key collisions. You can run multiple threads within the one process just fine, but you can’t run multiple processes.
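To make the collision concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model, not NHibernate's code: the `IncrementGenerator` class and the in-memory `table` set are hypothetical stand-ins for the generator and the database table.

```python
class IncrementGenerator:
    """Toy model of an in-memory increment generator (not NHibernate's code)."""

    def __init__(self, table):
        # Seed once from the current maximum id in the table (0 if it is empty).
        self.next_id = max(table, default=0) + 1

    def generate(self):
        # From here on, ids come from memory only; the table is never re-read.
        value = self.next_id
        self.next_id += 1
        return value


table = {1, 2, 3}                      # ids already stored in the shared table
process_a = IncrementGenerator(table)  # both "processes" seed from max = 3
process_b = IncrementGenerator(table)

id_a = process_a.generate()
id_b = process_b.generate()
collision = id_a == id_b               # both processes hand out the same key
print(id_a, id_b, collision)
```

Two threads sharing one generator instance would not collide in this model, since they draw from the same counter; two separately seeded instances do.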
For comparison, the NHibernate identity generator works by configuring the database tables with identity columns, putting control over primary key generation in the hands of the database. This works well, but compromises the unit of work pattern, because NHibernate has to execute each INSERT immediately to learn the generated key instead of deferring writes until the session is flushed.
The hi-lo algorithm sits in between these: generation of primary keys is coordinated through the database, allowing for multiprocessing, but actual allocation can occur entirely in memory, avoiding problems with the unit of work pattern.
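The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than NHibernate's implementation: `HiLoTable` is a hypothetical stand-in for the database row that holds the next hi value, and the block size is taken to be `max_lo + 1`.

```python
class HiLoTable:
    """Hypothetical stand-in for the database row holding the next hi value."""

    def __init__(self, next_hi=0):
        self.next_hi = next_hi

    def reserve_hi(self):
        # A real database would do this read-and-increment atomically.
        hi = self.next_hi
        self.next_hi += 1
        return hi


class HiLoGenerator:
    """Toy hi-lo generator: one database trip per block of max_lo + 1 ids."""

    def __init__(self, table, max_lo):
        self.table = table
        self.block_size = max_lo + 1   # assumption: a block holds max_lo + 1 ids
        self.lo = self.block_size      # forces a reservation on first use
        self.base = 0

    def generate(self):
        if self.lo >= self.block_size:
            # Block exhausted: reserve the next hi value from the shared table.
            self.base = self.table.reserve_hi() * self.block_size
            self.lo = 0
        value = self.base + self.lo    # allocated purely in memory
        self.lo += 1
        return value


shared = HiLoTable(next_hi=13)
process_a = HiLoGenerator(shared, max_lo=999)
process_b = HiLoGenerator(shared, max_lo=999)

ids_a = [process_a.generate() for _ in range(3)]
ids_b = [process_b.generate() for _ in range(3)]
print(ids_a, ids_b)
```

Each generator reserves its own hi value, so the two ranges never overlap, and only one database access is needed per block of ids.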
To use the hi-lo generator you will need to create the lookup table that will store the next value for the “Hi” part of the generated keys. You have the choice of creating a separate column for each entity table, a single column that will be used by all entities, or a combination of the two options.
If a shared column is used then each generated key value will be used by only one entity across all the tables that share it. This may be preferable if there are many entity tables, but it reduces the total number of ids that can be generated.
For example, our project uses a HiLoLookup table with three columns:
The log tables have a high volume of inserts, so they have been given a separate pool of hi values. The primary key columns of our regular entity tables use the 64-bit BIGINT data type, so there is no danger of overflowing even if there are large gaps in the sequence of ids. A shared pool of ids is used to reduce administration overhead.
The hi-lo generator doesn’t have built-in support for initializing itself with starting values that don’t conflict with existing keys, so this will need to be performed manually.
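As a rough illustration of what the manual step involves, the Python sketch below picks a starting hi value that clears every existing key. The table names and maximum ids are made up for the example, and it assumes each hi value covers a block of `ids_per_block` consecutive ids starting at `hi * ids_per_block`.

```python
def starting_hi(max_existing_id, ids_per_block):
    """Smallest hi whose block starts above every existing id."""
    return max_existing_id // ids_per_block + 1


def first_generated_id(hi, ids_per_block):
    """First id in the block reserved by this hi value (assumed layout)."""
    return hi * ids_per_block


# Hypothetical current maximum ids for tables that share one hi column.
existing_max_ids = {"Customer": 12345, "Order": 8210, "Invoice": 977}

# A shared column has to clear the largest key of any table that uses it.
shared_max = max(existing_max_ids.values())
hi = starting_hi(shared_max, ids_per_block=1000)
first_id = first_generated_id(hi, ids_per_block=1000)
print(hi, first_id)
```

In practice the maximum ids would come from a `SELECT MAX(...)` against each table, and the resulting hi value would be written into the lookup table before the application is switched over.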
The value to use as the starting “hi” value depends on several considerations:
- The number of ids that should be generated before going back to the database (max_lo): a bigger value improves concurrency but increases the potential for ids to be wasted, especially if the service is restarted frequently
The max_lo value that is provided in your entity mappings is critical when determining what your starting “hi” values should be. For example, consider a table with a maximum existing id value of 12345, where the number of ids that should be generated before going back to the database is 1000. In this case, the starting hi value should be (12345 / 1000) + 1 = 13 (using integer division), and the first generated id will be 13000. Due to a quirk in the HiLoGenerator implementation, the max_lo value provided in the entity configuration needs to be 999, not 1000.
If using .hbm mappings: