I have the following table structure
term_id int(11)
concept_id int(11)
idf double
And I have arround 2.4 milions records. When I perform the following query
stmt = con.prepareStatement("SELECT term, idf.idf FROM idf "
+ "JOIN term ON term.id=idf.term_id "
+ "WHERE concept_id= ?");
stmt.setInt(1, id);
res = stmt.executeQuery();
It takes about 3 seconds for a query. I am using MySQL Connector and I am performing this query from java code/eclipse.
Could I improve this speed ?
If a try performing a query directly in phpMyAdmin I have the following time results:
Records 18,869 total Time : 0.4204 sec
SELECT term, idf.idf
FROM idf
JOIN term ON term.id = idf.term_id
WHERE concept_id = 4
The problem is in this line:
You are trying to push 18,869 records from the remote host to your local client.
Of course you don’t need 18k records, who was the time to read all that?
So you probably only need the top few rows.
Or you need to do further processing.
Get top 100 rows
Store the data in a memory table for further processing
And than do a further select from mem_idfterm,
or use the (now infamous) select as a subselect to extract more data from.
Try to never push more than 1000 rows across the network.
Finally you can get the data in batches:
etc.
I recommend using an
order byclause when using limit. And if you must have consistent results, do all the piecemeal selects in a single transaction.