I’m working on a large webapp, which (among other things) has a table which is built from an existing set of data online. This data-set may change (though probably not frequently) so our general plan is to update/rebuild every 1-2 months. This will all be happening through Python with SQLAlchemy.
With little experience in dealing with both webapps and large databases, what’s the best way to do this? Building the database from scratch will take 5-6 hours, which, to be honest, is acceptable downtime (it’s a scientific analysis server). Of course, another option is to create a second table in parallel, then drop the original table and rename the new one, but does this have consistency issues? Is there a way to “live update” a table, or is the risk of a crash not worth it (i.e. a crash partway through would leave the table in an inconsistent state relative to the real data)?
Clearly there’s a trade-off between simplicity, safety and lack of downtime, but I’m just interested in what my options are (to suss out those “unknown unknowns”).
Here’s what I’d recommend:
1. Back up the original table using mysqldump
2. Build the new table under a different name
3. Back up the new table using mysqldump
4. Shut down your application
5. Drop the old table
6. Rename the new table to the original table name
7. Restart your application
This should yield minimal downtime: just the time it takes to drop the old table and rename the new one. The backups give you safety in case you screw something up.
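The drop-and-rename part of this could be sketched roughly as follows. This is only an illustration under some assumptions: the table names `measurements` and `measurements_new` and the helper `swap_in_new_table` are made up for the example, and the demo uses Python's built-in `sqlite3` so it runs without a server. The same two statements are valid in MySQL, but there each DDL statement commits implicitly, so a crash between the drop and the rename cannot be rolled back (hence the backups).

```python
import sqlite3


def swap_in_new_table(conn, live_name, staging_name):
    """Drop the live table and rename the freshly built table into its place.

    Hypothetical helper: `conn` is any DB-API connection. Table names cannot
    be passed as bound parameters, so only use trusted, hard-coded names.
    """
    cur = conn.cursor()
    cur.execute(f"DROP TABLE {live_name}")
    cur.execute(f"ALTER TABLE {staging_name} RENAME TO {live_name}")
    conn.commit()


# Demo with an in-memory SQLite database (stand-in for the real server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO measurements (value) VALUES ('old')")

# The rebuilt data lives under a different name until the swap.
conn.execute("CREATE TABLE measurements_new (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO measurements_new (value) VALUES ('new')")
conn.commit()

swap_in_new_table(conn, "measurements", "measurements_new")

rows = conn.execute("SELECT value FROM measurements").fetchall()
print(rows)  # [('new',)]
```

On MySQL specifically, `RENAME TABLE` accepts several renames in one statement and applies them atomically, so renaming the old table aside and the new one into place in a single statement is another option if you want to avoid any window in which the table does not exist.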