We’ve got an Oracle 11g installation that is starting to get big. This database is the backend to a parallel optimization system running on a cluster. Input to the process is contained in the database along with output from the optimization steps. The input includes rote configuration data and some binary files (using 11g’s SecureFiles). The output includes 1D, 2D, 3D, and 4D data currently stored in the DB.
DB Structure:
/* Metadata tables */
Case(CaseId, DeleteFlag, ...) On Delete Cascade CaseId
OptimizationRun(OptId, CaseId, ...) On Delete Cascade OptId
OptimizationStep(StepId, OptId, ...) On Delete Cascade StepId
/* Data tables */
Files(FileId, CaseId, Blob) /* deletes are near instantateous here */
/* Data per run */
OnedDataX(OptId, ...)
TwoDDataY1(OptId, ...) /* packed representation of a 1D slice */
/* Data not only per run, but per step */
TwoDDataY2(StepId, ...) /* packed representation of a 1D slice */
ThreeDDataZ(StepId, ...) /* packed representation of a 2D slice */
FourDDataZ(StepId, ...) /* packed representation of a 3D slice */
/* ... About 10 or so of these tables exist */
A reaper script comes around daily and looks for cases with the DeleteFlag = 1 and proceeds with the DELETE FROM Case WHERE DeleteFlag = 1, allowing the cascades to continue.
This strategy works great for read/write, but is now outstripping our capabilities when we want to purge data! The rub is deleting a Case takes ~20-40 minutes depending on the size and often overloads our archiver space. The next major version of the product will take a “from the ground up” approach to solving the problem. The next minor release needs to stay within the confines of data stored in the database.
So, for the minor release we need an approach that can improve delete performance and at most require moderate changes to the database.
- REF Partitioning, but the question is HOW? I would love to do INTERVAL on
Caseand REF on the rest, but that isn’t supported. Is there some way to manually partitionOptimizationRunbyCaseIdthrough a trigger? - Disable archiving/redo logs for deletes? Couldn’t find a HINT to go with this one. Not sure it is even feasible.
Truncate? This likely would need some sorta complicated table setup. But maybe I’m not considering all of my option.(per answer, stricken)
To help illustrate the issue, the data in question per case ranges from 15MiB to 1.5GiB with anywhere from 20k to 2M rows.
Update: Current size of the DB is ~1.5TB.
Deleting data is a hell of a job, for the database. It has to create before images, update indexes, write redo logs and remove the data. This is a slow process. If you can have a window to perform this task, easiest and fastest is to build new tables, containing the wanted data. Drop the old tables and rename the new tables.
This requires some setup work, that is obvious but is very well possible to make.
One step less drastic is to drop the indexes before the delete takes place. My vote would go for CTAS (Create Table As Select from) and build the new tables.
A nice partitioning schema would certainly be helpful, maybe in the next release Oracle can combine interval and reference partitioning. It would be very nice to have.
Disabling logging …. can not be done for deletes but CTAS can use nologging. Make a backup when ready and make sure to transfer the datafiles to the standby database, if you have one.