I have several servers running their own instance of a particular MySQL database which unfortunately cannot be setup in replication/cluster. Each server inserts data into several user-related tables which have foreign key constraints between them (e.g. user, user_vote). Here is how the process goes about:
- all the servers start with the same data
- each server grows its own set of data indepedently from the other servers
- periodically, the data from all the servers is merged manually together and applied back to each server (the process therefore repeats itself from step 1).
This is made possible because in addition to its primary key, the user table contains a unique email field which allows identifying which users are already existing in each database, and merging those who are new while changing the primary and foreign keys to avoid collisions and maintain the correct foreign key constraints. It works, but it’s quite some effort because primary and foreign keys have to be changed to avoid collision, hence my question:
Is there a way to have each server use primary keys that don’t collide with other servers to facilitate the merging?
I initially wanted to use a composite primary key (e.g. server_id, id) but I am using Doctrine which doesn’t support primary keys composed of multiple foreign keys so I would have problems with my foreign key constraints.
I thought about using a VARCHAR as an id and using part of the string as a prefix (SERVER1-1,SERVER1-2, SERVER2-1, SERVER2-2…) but I’m thinking it will make the DB slower as I will have to do some manipulations with the ids (e.g. on insert, I have to parse existing ids and extract highest, increment it, concatenate it with server id…).
PS: Another option would be to implement replication with read from slaves and write to master but this option was discarded because of issues such as replication lag and single point of failure on the master which can’t be solved for now.
You can make sure each server uses a different incrementation of autoincrement, and a different start offset:
Change the step auto_increment fields increment by
(assuming you are using auoincrements)
I’ve only ever used this across two servers, so my set-up had one with even ids and one with odd.
When they are merged back together nothing will collide, as long as you make sure all tables follow the above idea.
in order to implement for 4 servers
You would say, set-up the following offsets:
You would set your incrementation as such (I’ve used 10 to leave space for extra servers):
And then after you have merged, before copying back to each server, you would just need to update the autoinc value for each table to have the correct offset again. Imagine each server had created 100 rows, autoincs would be:
This is where it does get tricky due to having four servers. For imagine certain tables may not have had any rows inserted from a particular server. So you could end up with some tables having their last autoinc id not being from server 4, but from being from server 2 instead. This would make it very tricky to work out what the next autoinc should be for any particular table.
For this reason it is probably best to also include a column in each of your tables that records the server number when any rows are inserted.
That way you can easily find out what the last autoinc value should be for a particular server by selecting the following on any of your tables:
Using this value you can reset the next autoinc value you need for each table on each server, before rolling the merged dataset out to the server in question.
Where
sis the server number andnis set to the offset, so in my example it would be10.