I have a 15 GB SQLite db with 40 columns. I would like to remove most of the columns to make queries faster and the db more portable. I found this guidance, but can’t get it to work. It seems like it hangs and I end up corrupting the db and having to start over. Here’s the script I use:
BEGIN TRANSACTION;
CREATE TEMPORARY TABLE dly_backup(DATE INTEGER,
TICKER TEXT,
TSYMBOL TEXT,
VOL INTEGER,
RET REAL,
RETX REAL,
VWRETD REAL,
VWRETX REAL,
EWRETD REAL,
EWRETX REAL,
SPRTRN REAL
);
INSERT INTO dly_backup SELECT DATE INTEGER,
TICKER TEXT,
TSYMBOL TEXT,
VOL INTEGER,
RET REAL,
RETX REAL,
vwretd REAL,
vwretx REAL,
ewretd REAL,
ewretx REAL,
sprtrn REAL
FROM dly;
DROP TABLE dly;
CREATE TABLE dly(DATE INTEGER,
TICKER TEXT,
TSYMBOL TEXT,
VOL INTEGER,
RET REAL,
RETX REAL,
VWRETD REAL,
VWRETX REAL,
EWRETD REAL,
EWRETX REAL,
SPRTRN REAL
);
INSERT INTO dly SELECT DATE INTEGER,
TICKER TEXT,
TSYMBOL TEXT,
VOL INTEGER,
RET REAL,
RETX REAL,
VWRETD REAL,
VWRETX REAL,
EWRETD REAL,
EWRETX REAL,
SPRTRN REAL
FROM dly_backup;
DROP TABLE dly_backup;
COMMIT;
Is there a better way to do this? FWIW, I have the original .csv file and import it using the RSQLite package in R.
Is there a way that I can only import a subset of columns in a .csv file? Thanks! (new to SQLite)
I actually have no idea whether 15 GB is a large size for SQLite; I tend to use DBMSs where 15 GB can be considered a configuration table 🙂
However, one thing we normally do for this sort of work, which may help you out:
*a Of course, we don’t really do this, we have multiple redundant database instances with failover and all sorts of other wonderful features like replication, but it’s probably workable for a small database like this.
One thing I have noticed is that you copy the partial rows to the backup table, re-create the original table, then copy them back row by row, before deleting the backup.
It seems to me that you could simply rename the current table to the backup one instead of making that first copy. It doesn’t matter that you still have the unneeded columns in the backup since you’re not going to transfer them and will eventually delete the backup table. Give this a try (while also minimising your transaction scope):
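A minimal sketch of that rename-based approach, run through Python's built-in sqlite3 module purely so it is self-contained. The SQL statements are the part that matters. The shortened column list and the OPENPRC, BID and ASK columns are hypothetical stand-ins for your real 40 columns, and the in-memory database stands in for your real file.

```python
import sqlite3

# In-memory stand-in for the real database file (assumption for illustration).
# isolation_level=None puts the connection in autocommit mode, so the only
# transaction is the one opened explicitly with BEGIN below.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Hypothetical wide table: OPENPRC, BID and ASK play the unwanted columns.
conn.execute(
    "CREATE TABLE dly (DATE INTEGER, TICKER TEXT, VOL INTEGER, RET REAL, "
    "OPENPRC REAL, BID REAL, ASK REAL)"
)
conn.executemany(
    "INSERT INTO dly VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (20200102, "MSFT", 100, 0.01, 158.0, 157.9, 158.1),
        (20200102, "IBM", 200, -0.02, 135.0, 134.9, 135.1),
        (20200103, "MSFT", 150, 0.03, 159.0, 158.9, 159.1),
    ],
)

# 1. Rename the existing table; no rows are copied at this point.
conn.execute("ALTER TABLE dly RENAME TO dly_backup")

# 2. Re-create dly with only the columns to keep.
conn.execute("CREATE TABLE dly (DATE INTEGER, TICKER TEXT, VOL INTEGER, RET REAL)")

# 3. Copy the wanted columns across; this is the only step in a transaction.
#    Note the SELECT lists bare column names, with no types after them.
conn.execute("BEGIN")
conn.execute("INSERT INTO dly SELECT DATE, TICKER, VOL, RET FROM dly_backup")
conn.execute("COMMIT")

# dly_backup is deliberately left in place until the result has been checked.
kept_columns = [row[1] for row in conn.execute("PRAGMA table_info(dly)")]
row_count = conn.execute("SELECT COUNT(*) FROM dly").fetchone()[0]
backup_count = conn.execute("SELECT COUNT(*) FROM dly_backup").fetchone()[0]
print(kept_columns, row_count, backup_count)
```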
This will result in a transaction that is half the size of the one you’re attempting.
Then, and only then, and only if there were no errors, would you drop dly_backup. If you still had problems with the process, you would drop dly, then rename the backup table back to the original and try again.
Another thing you may want to try is to limit the data being transferred in a test run, to see if it runs okay with a smaller data set. Using your original code, try to create the dly_backup table but only copy across a subset of data (assuming these are NYSE/NASDAQ quotes, you can do something like using a where clause to only get one ticker symbol such as MSFT or IBM).
Don’t drop any tables in the test run.
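The test run described above might look something like the following sketch. It again uses Python's built-in sqlite3 module with an in-memory database and a shortened, hypothetical column list. The sample rows are made up, and MSFT is just the example ticker. Nothing is dropped, so the original dly table is untouched afterwards.

```python
import sqlite3

# In-memory stand-in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dly (DATE INTEGER, TICKER TEXT, VOL INTEGER, RET REAL)")
conn.executemany(
    "INSERT INTO dly VALUES (?, ?, ?, ?)",
    [
        (20200102, "MSFT", 100, 0.01),
        (20200102, "IBM", 200, -0.02),
        (20200103, "MSFT", 150, 0.03),
    ],
)

# Create the backup table as in the original script...
conn.execute(
    "CREATE TEMPORARY TABLE dly_backup "
    "(DATE INTEGER, TICKER TEXT, VOL INTEGER, RET REAL)"
)

# ...but copy only one ticker's rows, so the test transaction stays small.
conn.execute(
    "INSERT INTO dly_backup SELECT DATE, TICKER, VOL, RET FROM dly WHERE TICKER = ?",
    ("MSFT",),
)
conn.commit()

# No DROP TABLE here: both tables survive the test run and can be compared.
backup_rows = conn.execute("SELECT COUNT(*) FROM dly_backup").fetchone()[0]
original_rows = conn.execute("SELECT COUNT(*) FROM dly").fetchone()[0]
print(backup_rows, original_rows)
```

If this runs cleanly on one ticker, widening the where clause step by step should show roughly where the full run starts to struggle.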
And I’ve just noticed your rather strange syntax for the insert...select statement where you have both the column name and type. I don’t know whether that’s an extension for SQLite but I think it would cause problems for other DBMSs that I’m familiar with. Was that a typo on your part?
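For what it's worth, my understanding is that SQLite does not treat the second word as a type there. Since INTEGER, TEXT and REAL are not reserved words in SQLite, "SELECT DATE INTEGER" appears to be read as the DATE column with an alias of INTEGER, the same as "SELECT DATE AS INTEGER". A small sketch to check this, using Python's built-in sqlite3 module and a hypothetical two-column table:

```python
import sqlite3

# Tiny in-memory table, made up purely to inspect how the SELECT is parsed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dly (DATE INTEGER, TICKER TEXT)")
conn.execute("INSERT INTO dly VALUES (20200102, 'MSFT')")

# The questioner's style: column name followed by a type name.
cur = conn.execute("SELECT DATE INTEGER, TICKER TEXT FROM dly")
aliased_names = [d[0] for d in cur.description]  # result column names
aliased_rows = cur.fetchall()

# The conventional style: bare column names.
cur = conn.execute("SELECT DATE, TICKER FROM dly")
plain_names = [d[0] for d in cur.description]
plain_rows = cur.fetchall()

print(aliased_names, plain_names)
print(aliased_rows == plain_rows)
```

If the two row sets match and only the result column names differ, the extra words are harmless aliases rather than the cause of the hang, though dropping them makes the script clearer and more portable.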