I have the following table:
CREATE TABLE notes (noteId INTEGER PRIMARY KEY ASC, note, note_length, count, unique(note) on conflict abort)
It contains 3 million rows.
I then execute the following command:
def getDistintNoteCountList(note_length):
with sqlite3.connect(r'./note_database') as connection:
cursor = connection.cursor()
cursor.execute('select distinct count from notes where note_length = ?', [note_length])
return [i[0] for i in cursor]
However, it takes 30 seconds for this function to execute, where the returned list has a size of around 20. Is this reasonable considering that I have 3 million records or have I done something wrong?
Thanks,
Barry
EDIT
Added:
cursor.execute("create index countIndex on notes (count)")
cursor.commit()
And reloaded the data into the database. It still seems to be just as slow.
Since the query has a where clause involving note_length, and requires the count field, the optimal index would be (note_length,count) in that order. This is a covering index btw, but I’m not sure sqlite is able to exploit it in this situation.
sqlite query planning is explained in this page