Here is a model of my table structure. There are three tables.
----------------
|possibilities |
|--------------|
|pid| category |
|--------------|
|1  | animal   |
|2  | vegetable|
|3  | mineral  |
----------------

----------------------------
|realities                 |
|--------------------------|
|rid | pid | item  | status |
|--------------------------|
|1   | 1   | dog   | 1 (yes)|
|2   | 1   | fox   | 1      |
|3   | 1   | cat   | 1      |
|4   | 2   | apple | 2 (no) |
|5   | 1   | mouse | 1      |
|7   | 1   | bat   | 2      |
----------------------------

-------------------------------
|measurements                 |
|-----------------------------|
|mid | rid | meas | date      |
|-----------------------------|
|1   | 1   | 3    | 2012-01-01|
|2   | 3   | 2    | 2012-01-05|
|3   | 1   | 13   | 2012-02-02|
|4   | 3   | 24   | 2012-02-15|
|5   | 2   | 5    | 2012-02-16|
|6   | 6   | 4    | 2012-02-17|
-------------------------------
What I’m after is a result that shows a series of counts, grouped by measurement range, for a particular entry from the “possibilities” table. Only “realities” whose status is 1 (meaning it’s currently being tracked) should be counted, BUT the only measurement that is relevant for each one is the most recent.
Here is an example of the result I’m looking for, using animal as the possibility.
-----------------------
| 0-9 | 10-19 | 20-29 |
|---------------------|
| 2 | 1 | 1 |
-----------------------
So, in this example, the apple row was not counted because it isn’t an animal, nor was bat because its status was set to no (meaning don’t measure), and only the most recent measurements were used to determine the counts.
I currently have a workaround in my real-world use, but it doesn’t follow good database normalization. In my realities table I have a current_meas column that gets updated whenever a new measurement is taken and entered in the measurements table. Then I only need the first two tables, and a single SELECT statement with a bunch of embedded SUM expressions, each wrapping an IF that tests whether the value falls in a range (between 0 and 9, for example). It gives me exactly what I want; however, my app has evolved to the point where this convenience has become a problem in other areas.
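For reference, the workaround described above might look roughly like the sketch below. It is not my production query. It uses Python's built-in sqlite3 module only so that it runs on its own, and it writes MySQL's SUM(IF(...)) as the portable SUM(CASE WHEN ...) form. The current_meas values are assumptions filled in from the sample data (with mouse assumed to have a latest measurement of 4), and the function names are hypothetical.

```python
import sqlite3


def build_sample_db():
    """Create the two tables the workaround needs, with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE possibilities (pid INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE realities (
            rid INTEGER PRIMARY KEY, pid INTEGER, item TEXT, status INTEGER,
            current_meas INTEGER  -- denormalized copy of the latest measurement
        );
        INSERT INTO possibilities VALUES
            (1, 'animal'), (2, 'vegetable'), (3, 'mineral');
        -- ASSUMPTION: current_meas values are illustrative, not from the post.
        INSERT INTO realities VALUES
            (1, 1, 'dog', 1, 13), (2, 1, 'fox', 1, 5), (3, 1, 'cat', 1, 24),
            (4, 2, 'apple', 2, NULL), (5, 1, 'mouse', 1, 4),
            (7, 1, 'bat', 2, NULL);
    """)
    return conn


def range_counts_denormalized(conn, category):
    """Count tracked realities per range using the current_meas column."""
    sql = """
        SELECT
            SUM(CASE WHEN r.current_meas BETWEEN 0 AND 9 THEN 1 ELSE 0 END),
            SUM(CASE WHEN r.current_meas BETWEEN 10 AND 19 THEN 1 ELSE 0 END),
            SUM(CASE WHEN r.current_meas BETWEEN 20 AND 29 THEN 1 ELSE 0 END)
        FROM realities r
        JOIN possibilities p ON p.pid = r.pid
        WHERE p.category = ? AND r.status = 1
    """
    return conn.execute(sql, (category,)).fetchone()


if __name__ == "__main__":
    print(range_counts_denormalized(build_sample_db(), "animal"))
```

The catch is the one already mentioned: current_meas has to be kept in sync with the measurements table by hand.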
So, the question is, is there a more elegant way to do this in one statement? Subselects? Temporary tables? Getting the counts is the heart of what the app is about.
This is a PHP5, MySQL5, jQuery 1.8 based webapp, in case that gives me some more options. Thanks in advance. I love the stack and hope to help back as much as it has helped me.
Here is what I ended up doing based on the two answers suggested.
1. The first temp table holds the realities that are based on one possibility (animals) and whose status is 1 (yes).
2. The second temp table takes the individual realities from the first temp table and finds the most recent measurement for each one.
3. A final SELECT gets the counts in ranges.
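The steps above can be sketched against the sample schema as follows. This is only an illustration, not my real-world code: it runs on Python's built-in sqlite3 so it is self-contained, the temp table names (tmp_active, tmp_latest) and function names are hypothetical, and it assumes at most one measurement per reality per date. The sample rows are used with one loudly stated assumption: measurement 6 is attached to mouse (rid 5) so that the counts match the example result.

```python
import sqlite3


def build_sample_db():
    """Create the three sample tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE possibilities (pid INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE realities (
            rid INTEGER PRIMARY KEY, pid INTEGER, item TEXT, status INTEGER);
        CREATE TABLE measurements (
            mid INTEGER PRIMARY KEY, rid INTEGER, meas INTEGER, date TEXT);
        INSERT INTO possibilities VALUES
            (1, 'animal'), (2, 'vegetable'), (3, 'mineral');
        INSERT INTO realities VALUES
            (1, 1, 'dog', 1), (2, 1, 'fox', 1), (3, 1, 'cat', 1),
            (4, 2, 'apple', 2), (5, 1, 'mouse', 1), (7, 1, 'bat', 2);
        -- ASSUMPTION: measurement 6 belongs to rid 5 (mouse).
        INSERT INTO measurements VALUES
            (1, 1, 3, '2012-01-01'), (2, 3, 2, '2012-01-05'),
            (3, 1, 13, '2012-02-02'), (4, 3, 24, '2012-02-15'),
            (5, 2, 5, '2012-02-16'), (6, 5, 4, '2012-02-17');
    """)
    return conn


def counts_via_temp_tables(conn, category):
    """Three-step version: active realities, latest measurement, counts."""
    cur = conn.cursor()
    cur.executescript("""
        DROP TABLE IF EXISTS tmp_active;
        DROP TABLE IF EXISTS tmp_latest;
        CREATE TEMP TABLE tmp_active (rid INTEGER PRIMARY KEY);
        CREATE TEMP TABLE tmp_latest (rid INTEGER, meas INTEGER);
    """)
    # Step 1: realities for one possibility whose status is 1 (yes).
    cur.execute("""
        INSERT INTO tmp_active
        SELECT r.rid FROM realities r
        JOIN possibilities p ON p.pid = r.pid
        WHERE p.category = ? AND r.status = 1
    """, (category,))
    # Step 2: the most recent measurement for each of those realities.
    cur.execute("""
        INSERT INTO tmp_latest
        SELECT m.rid, m.meas FROM measurements m
        JOIN tmp_active a ON a.rid = m.rid
        WHERE m.date = (SELECT MAX(m2.date) FROM measurements m2
                        WHERE m2.rid = m.rid)
    """)
    # Step 3: counts in ranges (COALESCE turns "no rows" into zeros).
    cur.execute("""
        SELECT
            COALESCE(SUM(CASE WHEN meas BETWEEN 0 AND 9 THEN 1 ELSE 0 END), 0),
            COALESCE(SUM(CASE WHEN meas BETWEEN 10 AND 19 THEN 1 ELSE 0 END), 0),
            COALESCE(SUM(CASE WHEN meas BETWEEN 20 AND 29 THEN 1 ELSE 0 END), 0)
        FROM tmp_latest
    """)
    return cur.fetchone()


if __name__ == "__main__":
    print(counts_via_temp_tables(build_sample_db(), "animal"))
```

In MySQL the equivalent statements would be CREATE TEMPORARY TABLE ... SELECT, and a script would repeat the three steps once per possibility.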
When I tried it with just one temp table, the query would take 5-10 seconds per possibility. In my real-world use I currently have 30 possibilities (a script loops through each one and generates these temp tables and selects), well over 1,000 realities (600 active on any given day, 100 added per month), and over 21,000 measurements (20-30 added daily). That just wasn’t working for me. Breaking it up into smaller pools to draw from brought the whole report down to 3-4 seconds at most.
Here is the MySQL stuff with my real-world table and column names.
UPDATE
I got it to work in a single SELECT statement without using temporary tables and it works faster. Using my sample schema above, here is how it looks.
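One way such a single statement can be written against the sample schema is sketched below. It is a sketch under assumptions rather than the exact query: it joins each measurement to a derived table holding the latest date per reality, it assumes at most one measurement per reality per date, and it treats measurement 6 as belonging to mouse (rid 5) so the counts match the example result. It runs on Python's built-in sqlite3 only to be self-contained; the function names and the derived-table alias are hypothetical.

```python
import sqlite3


def build_sample_db():
    """Create the three sample tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE possibilities (pid INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE realities (
            rid INTEGER PRIMARY KEY, pid INTEGER, item TEXT, status INTEGER);
        CREATE TABLE measurements (
            mid INTEGER PRIMARY KEY, rid INTEGER, meas INTEGER, date TEXT);
        INSERT INTO possibilities VALUES
            (1, 'animal'), (2, 'vegetable'), (3, 'mineral');
        INSERT INTO realities VALUES
            (1, 1, 'dog', 1), (2, 1, 'fox', 1), (3, 1, 'cat', 1),
            (4, 2, 'apple', 2), (5, 1, 'mouse', 1), (7, 1, 'bat', 2);
        -- ASSUMPTION: measurement 6 belongs to rid 5 (mouse).
        INSERT INTO measurements VALUES
            (1, 1, 3, '2012-01-01'), (2, 3, 2, '2012-01-05'),
            (3, 1, 13, '2012-02-02'), (4, 3, 24, '2012-02-15'),
            (5, 2, 5, '2012-02-16'), (6, 5, 4, '2012-02-17');
    """)
    return conn


SINGLE_SELECT = """
    SELECT
        COALESCE(SUM(CASE WHEN m.meas BETWEEN 0 AND 9 THEN 1 ELSE 0 END), 0),
        COALESCE(SUM(CASE WHEN m.meas BETWEEN 10 AND 19 THEN 1 ELSE 0 END), 0),
        COALESCE(SUM(CASE WHEN m.meas BETWEEN 20 AND 29 THEN 1 ELSE 0 END), 0)
    FROM possibilities p
    JOIN realities r ON r.pid = p.pid AND r.status = 1
    JOIN measurements m ON m.rid = r.rid
    -- Derived table: the most recent measurement date for each reality.
    JOIN (SELECT rid, MAX(date) AS max_date
          FROM measurements GROUP BY rid) newest
      ON newest.rid = m.rid AND newest.max_date = m.date
    WHERE p.category = ?
"""


def counts_single_select(conn, category):
    """Return (0-9, 10-19, 20-29) counts using one SELECT, no temp tables."""
    return conn.execute(SINGLE_SELECT, (category,)).fetchone()


if __name__ == "__main__":
    print(counts_single_select(build_sample_db(), "animal"))
```

Realities with no measurement at all simply drop out of the inner joins, so they are not counted in any range.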