I'm writing a simple Java program that does a simple task: it takes a folder of text files as input and returns the 5 most frequent words per document.
At first I tried to do it without any database support, but when I started running into memory problems I changed my approach and configured the program to run with SQLite.
Everything works fine now, but just adding the words to the database takes a long time (67 seconds for 801 words).
Here is how I initialize the database:
    this.Execute(
        "CREATE TABLE words ("+
        "word VARCHAR(20)"+
        ");"
    );
    this.Execute(
        "CREATE UNIQUE INDEX wordindex ON words (word);"
    );
Then, once the program has counted the documents in the folder (let's say N), I add N counter columns and N frequency columns to the table:
    for(int i = 0; i < fileList.size(); i++)
    {
        db.Execute("ALTER TABLE words ADD doc"+i+" INTEGER");
        db.Execute("ALTER TABLE words ADD freq"+i+" DOUBLE");
    }
Finally, I add words using the following function:
    public void AddWord(String word, int docid)
    {
        String query = "UPDATE words SET doc"+docid+"=doc"+docid+"+1 WHERE word='"+word+"'";
        int rows = this.ExecuteUpdate(query);
        if( rows <= 0)
        {
            query = "INSERT INTO words (word,doc"+docid+") VALUES ('"+word+"',1)";
            this.ExecuteUpdate(query);
        }
    }
Am I doing something wrong, or is it normal for an update query to take this long to execute?
Wrap all commands inside one transaction. Otherwise you get one transaction (with the associated storage synchronization) per command.
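As a rough illustration, this is what it could look like with plain JDBC. It is only a sketch under some assumptions: the class name `WordBatchWriter` and the method `addWords` are made up for this example, the `Connection` is assumed to be already open on the database from the question, and the table and column names (`words`, `word`, `doc<N>`) are taken from the question. It also uses prepared statements instead of string concatenation, so words containing an apostrophe do not break the query, and `COALESCE` so that a counter column that is still NULL starts from 0.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper, not part of the original program.
final class WordBatchWriter {

    // Adds every word of one document inside a single transaction,
    // so the database synchronizes to storage once instead of once per statement.
    static void addWords(Connection conn, List<String> words, int docId) throws SQLException {
        // docId is an int, so building the column name by concatenation is safe here.
        String col = "doc" + docId;
        // COALESCE is an addition: a column created by ALTER TABLE starts out NULL.
        String updateSql = "UPDATE words SET " + col + " = COALESCE(" + col + ", 0) + 1 WHERE word = ?";
        String insertSql = "INSERT INTO words (word, " + col + ") VALUES (?, 1)";

        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // start of the single transaction
        try (PreparedStatement update = conn.prepareStatement(updateSql);
             PreparedStatement insert = conn.prepareStatement(insertSql)) {
            for (String word : words) {
                update.setString(1, word);
                if (update.executeUpdate() == 0) {
                    // No existing row for this word: insert it with a count of 1.
                    insert.setString(1, word);
                    insert.executeUpdate();
                }
            }
            conn.commit(); // one commit for the whole batch
        } catch (SQLException e) {
            conn.rollback(); // undo the partial batch
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

You would then call `WordBatchWriter.addWords(conn, wordsOfDocument, docId)` once per document (or once for the whole folder) instead of calling `AddWord` for each word separately. How the connection is obtained depends on your setup; with a SQLite JDBC driver on the classpath it is typically something like `DriverManager.getConnection("jdbc:sqlite:words.db")`, where the file name is just a placeholder.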