I have two database tables, one in MYSQL and one in MSSQL. Both have similar data, and one is based on data from another. They’re in two different databases because one is a remote system that is administered and the local system is a Drupal installation which I’m using to show the data in a more friendly manner through a custom module.
For example, I have a table of this sort of structure in MSSQL:
ID | Title | Description | Other fields I don't care about
And based on pulling data from this table I generate a table in MYSQL:
local_id | remote_id | title | description
When the module is initialized, it goes out and does a select from the MSSQL table and generates records and populates the local database. Remote_id is the ID field in the MSSQL database so we can reference the two records together.
I need to sync up this data, deleting records locally which no longer exist on the remote table and creating new records which do not exist locally, and also update all rows information.
Problem is, that sort of requires at least 2 different transactions with possible by-row transactions as well. Example:
To sync local to Remote and remove non-existent remote records:
Select remote_id from local_table;
For Each remote_id ( select ID, title, description FROM remote_table where ID = remote_id )
If record exists
UPDATE local_table WHERE remote_id = row_id
Else
DELETE FROM local_table where remote_id = row_id
Then we need at least one other transaction to get new records (I could update here too if I didn’t do it in the previous loop):
Select ID, title, description from remote_table;
For each ID ( Select remote_id from local_table )
If does not exist
INSERT INTO local_table (VALUES)
So that’s a lot of db activity. It would be easier if the tables were of the same type but as it is that’s the only way I know how to do it. Is there a better way? Could I just pull both result sets into an associative array and compare that way and only do the transactions necessary to remove and create? I’m unsure.
there are a lot of ways to do this based on the system you house.
The first assumption i am making is that you have 2 databases and you want to sync data between these 2
that is MSSQL db must pull data from MySQL and vice versa
Your approach of using associative arrays is good but what if there are 100 columns in the table? ( in your case it is not but the approach is not future proof)
So to update 1 row you need to make “n” column comparisons if there are 100 rows, then there will be 100*n comparisons
Have a look at MySQL REPLACE, INSERT INTO .. ON DUPLICATE KEY clauses that might help you – i dont know if there are such clauses in MSSQL
You can do other things like – have a “last_updated” column in each database table – whenever a column in the table gets updated, this time-stamp field must be updated
This way you can tell if a row in either database table was updated ( by comparing it to your old timestamp value) and only update those rows
logic would be in these lines