I have a query that returns the correct result, but it takes about 45 seconds to run. That's definitely too long for presenting the data in a GUI.
So I'm looking for a much faster, more efficient query (something around a few milliseconds would be nice).
My data table currently has about 2,619,395 entries and is still growing.
Schema:
num | station | fetchDate | exportValue | error
1 | PS1 | 2010-10-01 07:05:17 | 300 | 0
2 | PS2 | 2010-10-01 07:05:19 | 297 | 0
923 | PS1 | 2011-11-13 14:45:47 | 82771 | 0
Explanation
- the exportValue is always incrementing
- the exportValue represents the actual absolute value
- in my case there are 10 stations
- every ~15 minutes 10 new entries are written to the table
- error is just a flag that indicates whether the station is working properly
Working query:
select
    YEAR(fetchDate), station, MAX(exportValue) - MIN(exportValue)
from
    registros
where
    exportValue > 0 and error = 0
group by
    station, YEAR(fetchDate)
order by
    YEAR(fetchDate), station
Output:
Year | station | Max-Min
2008 | PS1 | 24012
2008 | PS2 | 23709
2009 | PS1 | 28102
2009 | PS2 | 25098
My thoughts on it:
- writing several queries with BETWEEN ranges, e.g. 'between 2008-01-01 and 2008-01-02' to fetch the MIN(exportValue) and 'between 2008-12-30 and 2008-12-31' to fetch the MAX(exportValue). Problem: this needs a lot of queries, and it's not guaranteed that there will be data in a given time range.
- limiting the result set to my 10 stations only by ordering by MIN(fetchDate). Problem: this query also takes a long time to process.
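For illustration, the range idea from the first bullet could be sketched as one query per year rather than separate MIN and MAX queries. This is only a sketch under assumptions: the alias yearDelta is made up for the example, 2008 is taken from the sample output, and whether the range predicate is actually served by an index depends on which indexes exist on registros.

```sql
-- Sketch: per-station delta for a single year.
-- A half-open range on the bare fetchDate column replaces YEAR(fetchDate),
-- which may let the optimizer use an index that includes fetchDate.
SELECT
    station,
    MAX(exportValue) - MIN(exportValue) AS yearDelta  -- hypothetical alias
FROM registros
WHERE fetchDate >= '2008-01-01'
  AND fetchDate <  '2009-01-01'
  AND exportValue > 0
  AND error = 0
GROUP BY station
ORDER BY station;
```

Because the range covers the whole year, a station with no rows in that year simply does not appear in the result, which sidesteps the missing-data problem of picking narrow windows at the start and end of the year.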
Additional Info:
I'm using the query in a Java application (JPA 2.0), so it would be possible to do some post-processing on the result set if necessary.
Any help, approaches, or ideas are much appreciated. Thanks in advance.
Adding suitable indexes will help.
Two compound indexes will speed things up significantly: