We have a database with a very simple schema:
CREATE TABLE IF NOT EXISTS tblIndex(
frame_type INT,
pts VARCHAR(5),
ts_start INT primary key,
ts_end INT)
And the application scenario is :
-
Every second user will insert 2 ~ 50 records ,and the
ts_startfields of those records is always increasing. After 8 hours ,there are at most 1_800_000 records. By setting the sync mode to off, the performance seems OK so far.And because the data of each record has only 16 bytes, we may use some buffer even if the insert speed is not too fast. -
After 8 hours , user will tell me to delete the oldest data by telling the upper bound ts_start , so I will do
DELETE FROM tblIndex WHERE ts_start < upper_bound_ts_start.Delete 90_000 (which is the records for half a hour) out of the 1_800_000 now take 17 seconds. A litter bit longer than expected. Any way to reduce this ? We don’t care if the records is synced to the hard disk immediately. We are thinking start a separated thread to do the delete so to make this call to be async. The things I am not sure is whether the (long time )delete will impact the insert performance if they share the same connection? Or should I use a separated connection for insert and delete? But In this way, do they need be synced in application level?
-
Search.
SELECT ts_start FROM tblIndex WHERE ts_start BETWEEN ? AND ?– As ts_start is the primary key so the performance is OK for our need now. I am thinking i should use a separated connection for search , right?
Configuration of SQLite :
hard disk database (usb interface)
cache size is 2000
page size is 1024
sync mode is 0 (off)
journal_mode mode is truncate
Thanks for any suggestion to improve the delete performance or about the overall design.
EDIT: 350M MIPS CPU , not too much memory (<2M) specific for this application.
Since the data is transient — and small — why are you using a database?
You’d be much happier with a simple directory of flat files.
Your web query can simply read the relevant set of files and return the required results. 1_800_000 records of 16 bytes is just 28Mb of file data. You can read the whole thing into memory, do your processing in memory, and present the results.
A separate process can delete files that are old once a day at midnight.
A third process can append 2-50 16-byte records to the working file each second.
Write and flush so that the file is correct and complete after each I/O. If your reader handles an incomplete last record gracefully, you don’t even need a lock.
Name each file with a sequence number based on the time. You could, for example, take the system time (in seconds) divide by 4*60*60 and truncate the answer. That’s a sequence number that will advance once every 4 hours, creating a new file. 8 hours of data is 3 of these files (2 previous 4-hour files, plus the current working file.)