I have two tables: TableA and TableB. Both have “dates” and “rates” fields. I want the minimum rates from TableA with their dates, and the maximum rates from TableB with their dates. I would also like to list them for each month and year.
I use the query below to get the minimum and maximum rates from one table, but I could not figure out how to get the minimum rates from TableA and the maximum rates from TableB.
SELECT
MIN(rate) AS minRate,
(SELECT date FROM TableA WHERE rate = min(t2.rate) and month(date) = month(t2.date) and year(date) = year(t2.date) limit 1 ) as minDate,
MONTHNAME(date) as MN, YEAR(date) as YN,
MAX(rate) AS maxRate,
(SELECT date FROM TableA WHERE rate = max(t2.rate) and month(date) = month(t2.date) and year(date) = year(t2.date) limit 1) as maxDate
FROM TableA t2
GROUP BY YEAR(date), MONTH(date)
EDIT 1: I ended up with this.
SELECT a.MinYear AS Year, a.MinMonth AS Month, a.MinRate, b.MaxRate, a.MinDate, b.MaxDate
FROM (SELECT YEAR(date) AS MinYear, MONTH(date) AS MinMonth, MIN(rate) AS MinRate,
(SELECT date FROM $TableA WHERE rate = MIN(t2.rate) AND YEAR(date) = YEAR(t2.date) AND MONTH(date) = MONTH(t2.date) limit 1) AS MinDate
FROM $TableA t2
GROUP BY MinYear, MinMonth
) AS a
JOIN (SELECT YEAR(date) AS MaxYear, MONTH(date) AS MaxMonth, MAX(rate) AS MaxRate,
(SELECT date FROM $TableB WHERE rate = MAX(t3.rate) AND YEAR(date) = YEAR(t3.date) AND MONTH(date) = MONTH(t3.date) limit 1) AS MaxDate
FROM $TableB t3
GROUP BY MaxYear, MaxMonth
) AS b
ON a.MinYear = b.MaxYear AND a.MinMonth = b.MaxMonth
ORDER BY Year, Month
EDIT 2
Jonathan Leffler’s query (with minor changes after testing) performs better:
SELECT a.MinYear AS Year, a.MinMonth AS Month, a.MinDate, a.MinRate, b.MaxDate, b.MaxRate
FROM (SELECT n.MinYear, n.MinMonth, a.Date AS MinDate, n.MinRate
FROM $TableA AS a
JOIN (SELECT YEAR(date) AS MinYear, MONTH(date) AS MinMonth, MIN(rate) AS MinRate
FROM $TableA
GROUP BY MinYear, MinMonth
) AS n
ON a.Rate = n.MinRate AND YEAR(a.Date) = n.MinYear AND MONTH(a.Date) = n.MinMonth
) AS a
JOIN (SELECT x.MaxYear, x.MaxMonth, b.Date AS MaxDate, x.MaxRate
FROM $TableB AS b
JOIN (SELECT YEAR(date) AS MaxYear, MONTH(date) AS MaxMonth, MAX(rate) AS MaxRate
FROM $TableB
GROUP BY MaxYear, MaxMonth
) AS x
ON b.Rate = x.MaxRate AND YEAR(b.Date) = x.MaxYear AND MONTH(b.Date) = x.MaxMonth
) AS b
ON a.MinYear = b.MaxYear AND a.MinMonth = b.MaxMonth
ORDER BY Year, Month
Original answer
You need to create two result sets, one from TableA, one from TableB, and then join them. As with any complex SQL query, I build the result up in parts. First, we need the minimum rate for each month from TableA:
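A minimal sketch of that first step, assuming the column names date and rate used in the question and the MySQL-style YEAR() and MONTH() functions that appear there; the aliases MinYear, MinMonth and MinRate are taken from the query quoted in the question's EDIT 2:

```sql
-- One row per (year, month) in TableA, with the lowest rate seen in that month.
SELECT YEAR(date)  AS MinYear,
       MONTH(date) AS MinMonth,
       MIN(rate)   AS MinRate
  FROM TableA
 GROUP BY YEAR(date), MONTH(date);
```

Grouping by the expressions rather than by the aliases is deliberate: not every DBMS accepts select-list aliases in GROUP BY.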
The analogous query for maximum rates from TableB is:
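Under the same assumptions (columns date and rate, MySQL-style date functions), the TableB counterpart might look like this:

```sql
-- One row per (year, month) in TableB, with the highest rate seen in that month.
SELECT YEAR(date)  AS MaxYear,
       MONTH(date) AS MaxMonth,
       MAX(rate)   AS MaxRate
  FROM TableB
 GROUP BY YEAR(date), MONTH(date);
```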
Now you need to join these two results on year and month columns:
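A sketch of that join, with each per-month aggregate placed in the FROM clause as a sub-query (column names and functions assumed as in the question):

```sql
SELECT a.MinYear AS Year, a.MinMonth AS Month, a.MinRate, b.MaxRate
  FROM (SELECT YEAR(date) AS MinYear, MONTH(date) AS MinMonth,
               MIN(rate) AS MinRate
          FROM TableA
         GROUP BY YEAR(date), MONTH(date)
       ) AS a
  JOIN (SELECT YEAR(date) AS MaxYear, MONTH(date) AS MaxMonth,
               MAX(rate) AS MaxRate
          FROM TableB
         GROUP BY YEAR(date), MONTH(date)
       ) AS b
    ON a.MinYear = b.MaxYear AND a.MinMonth = b.MaxMonth
 ORDER BY Year, Month;
```

Because this is an inner join, a month appears in the output only when both tables have at least one row for it.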
Extension to manage missing data
If you have to worry about missing data from either TableA or TableB, then life is a bit more complex. You then really need a FULL OUTER JOIN, but some DBMS do not offer that. If you have to worry about some months being unrepresented in both tables, then you need to generate a table which specifies the dates (month and year) that you’re interested in, and then you can LEFT OUTER JOIN that with each of the two expressions above.
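As a sketch of the LEFT OUTER JOIN variant: the layout of MonthYearTable is not specified, so the integer columns Year and Month used here are an assumption, as are the column names date and rate carried over from the question.

```sql
-- Every month listed in MonthYearTable is reported; MinRate or MaxRate is NULL
-- when the corresponding table has no rows for that month.
SELECT m.Year, m.Month, a.MinRate, b.MaxRate
  FROM MonthYearTable AS m
  LEFT OUTER JOIN
       (SELECT YEAR(date) AS MinYear, MONTH(date) AS MinMonth,
               MIN(rate) AS MinRate
          FROM TableA
         GROUP BY YEAR(date), MONTH(date)
       ) AS a
    ON m.Year = a.MinYear AND m.Month = a.MinMonth
  LEFT OUTER JOIN
       (SELECT YEAR(date) AS MaxYear, MONTH(date) AS MaxMonth,
               MAX(rate) AS MaxRate
          FROM TableB
         GROUP BY YEAR(date), MONTH(date)
       ) AS b
    ON m.Year = b.MaxYear AND m.Month = b.MaxMonth
 ORDER BY m.Year, m.Month;
```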
If need be, you can specify the range of dates you are interested in from the MonthYearTable.
Finding the dates when the extremum rates occurred
If, as suggested in the comments, the answer should include the exact date(s) within each month when the maximum or minimum rate occurred, then the ‘find the extremum’ sub-queries are more complex:
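The query quoted in the question's EDIT 2 suggests the shape: join TableA back to its own per-month minimum so that the matching date can be read off. A sketch along those lines (column names date and rate assumed as before):

```sql
-- For each month, the minimum rate plus every date on which that rate occurred.
SELECT n.MinYear, n.MinMonth, a.date AS MinDate, n.MinRate
  FROM TableA AS a
  JOIN (SELECT YEAR(date) AS MinYear, MONTH(date) AS MinMonth,
               MIN(rate) AS MinRate
          FROM TableA
         GROUP BY YEAR(date), MONTH(date)
       ) AS n
    ON a.rate = n.MinRate
   AND YEAR(a.date)  = n.MinYear
   AND MONTH(a.date) = n.MinMonth;
```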
Similarly for the query against TableB:
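The same pattern applied to TableB, again as a sketch with the aliases used in the question's EDIT 2:

```sql
-- For each month, the maximum rate plus every date on which that rate occurred.
SELECT x.MaxYear, x.MaxMonth, b.date AS MaxDate, x.MaxRate
  FROM TableB AS b
  JOIN (SELECT YEAR(date) AS MaxYear, MONTH(date) AS MaxMonth,
               MAX(rate) AS MaxRate
          FROM TableB
         GROUP BY YEAR(date), MONTH(date)
       ) AS x
    ON b.rate = x.MaxRate
   AND YEAR(b.date)  = x.MaxYear
   AND MONTH(b.date) = x.MaxMonth;
```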
Combining these leads to the query:
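A sketch of the combined query, which closely follows the version quoted in the question's EDIT 2 (table and column names as in the question):

```sql
SELECT a.MinYear AS Year, a.MinMonth AS Month,
       a.MinDate, a.MinRate, b.MaxDate, b.MaxRate
  FROM (SELECT n.MinYear, n.MinMonth, t.date AS MinDate, n.MinRate
          FROM TableA AS t
          JOIN (SELECT YEAR(date) AS MinYear, MONTH(date) AS MinMonth,
                       MIN(rate) AS MinRate
                  FROM TableA
                 GROUP BY YEAR(date), MONTH(date)
               ) AS n
            ON t.rate = n.MinRate
           AND YEAR(t.date)  = n.MinYear
           AND MONTH(t.date) = n.MinMonth
       ) AS a
  JOIN (SELECT x.MaxYear, x.MaxMonth, t.date AS MaxDate, x.MaxRate
          FROM TableB AS t
          JOIN (SELECT YEAR(date) AS MaxYear, MONTH(date) AS MaxMonth,
                       MAX(rate) AS MaxRate
                  FROM TableB
                 GROUP BY YEAR(date), MONTH(date)
               ) AS x
            ON t.rate = x.MaxRate
           AND YEAR(t.date)  = x.MaxYear
           AND MONTH(t.date) = x.MaxMonth
       ) AS b
    ON a.MinYear = b.MaxYear AND a.MinMonth = b.MaxMonth
 ORDER BY Year, Month;
```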
Note that if the same minimum rate is reported on three different days in a given month, this will have three lines of output for that month, one for each of those days. In fact, if there are also two days on which the maximum rate occurred, then there’ll be six lines of output for that month. If this is not what’s required, then you can do an appropriate aggregate (meaning MIN or MAX, most likely) on the dates within the month:
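One way to sketch that aggregate for TableA, choosing MIN so that the earliest date with the minimum rate is reported (that choice is an assumption; MAX would report the latest instead):

```sql
-- Exactly one row per month: the minimum rate and the first date it occurred.
SELECT n.MinYear, n.MinMonth, MIN(a.date) AS MinDate, n.MinRate
  FROM TableA AS a
  JOIN (SELECT YEAR(date) AS MinYear, MONTH(date) AS MinMonth,
               MIN(rate) AS MinRate
          FROM TableA
         GROUP BY YEAR(date), MONTH(date)
       ) AS n
    ON a.rate = n.MinRate
   AND YEAR(a.date)  = n.MinYear
   AND MONTH(a.date) = n.MinMonth
 GROUP BY n.MinYear, n.MinMonth, n.MinRate;
```

The TableB expression would follow the same pattern with MAX(rate) in the inner query.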
And then combine this expression into the ‘final’ (next) version of the main query:
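A sketch of such a query, assuming the earliest matching date is wanted on both sides (hence MIN on the date in both sub-queries) and the column names from the question:

```sql
SELECT a.MinYear AS Year, a.MinMonth AS Month,
       a.MinDate, a.MinRate, b.MaxDate, b.MaxRate
  FROM (SELECT n.MinYear, n.MinMonth, MIN(t.date) AS MinDate, n.MinRate
          FROM TableA AS t
          JOIN (SELECT YEAR(date) AS MinYear, MONTH(date) AS MinMonth,
                       MIN(rate) AS MinRate
                  FROM TableA
                 GROUP BY YEAR(date), MONTH(date)
               ) AS n
            ON t.rate = n.MinRate
           AND YEAR(t.date)  = n.MinYear
           AND MONTH(t.date) = n.MinMonth
         GROUP BY n.MinYear, n.MinMonth, n.MinRate
       ) AS a
  JOIN (SELECT x.MaxYear, x.MaxMonth, MIN(t.date) AS MaxDate, x.MaxRate
          FROM TableB AS t
          JOIN (SELECT YEAR(date) AS MaxYear, MONTH(date) AS MaxMonth,
                       MAX(rate) AS MaxRate
                  FROM TableB
                 GROUP BY YEAR(date), MONTH(date)
               ) AS x
            ON t.rate = x.MaxRate
           AND YEAR(t.date)  = x.MaxYear
           AND MONTH(t.date) = x.MaxMonth
         GROUP BY x.MaxYear, x.MaxMonth, x.MaxRate
       ) AS b
    ON a.MinYear = b.MaxYear AND a.MinMonth = b.MaxMonth
 ORDER BY Year, Month;
```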
I’d hate to try to write that final query out in one go. But by building it up in stages, I’m moderately confident, even without submitting it to a DBMS, that it is close to accurate. If I was testing it, I might go straight for the final query, but if there was a problem with it, then I’d test the component queries, working with one sub-query at a time until the parts were producing the correct results and then combining the total query.
Extension to handle date ranges and missing data again
In the comments, the MonthYearTable caused minor confusion. As noted in my response in the comments, the issue is that if you have data in tables A and B for January and March but for some peculiar reason there is no data for February, then the ‘final’ query will not show anything for February. If you want to see explicitly the (absence of) values for February, the MonthYearTable can contain rows such as:
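For illustration only, here is one hypothetical layout, with integer Year and Month columns (the table's actual definition is not given), populated for the January to March 2011 example:

```sql
-- Hypothetical definition; one row per month that the report should cover.
CREATE TABLE MonthYearTable
(
    Year  INTEGER NOT NULL,
    Month INTEGER NOT NULL,
    PRIMARY KEY (Year, Month)
);

INSERT INTO MonthYearTable (Year, Month) VALUES (2011, 1);
INSERT INTO MonthYearTable (Year, Month) VALUES (2011, 2);  -- no data in TableA or TableB
INSERT INTO MonthYearTable (Year, Month) VALUES (2011, 3);
```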
And you can select the months to be reported on from there, and do a LEFT OUTER JOIN with the extremum queries in the final table. That way, even though there’s no data in TableA or TableB for February (2011-02), there will be a result row showing that. And, supposing you actually had data in MonthYearTable for every month from January 2009 to December 2012, but you wanted the report to cover the period from July 2009 to June 2011, you’d need to specify the filter condition on MonthYearTable (and you’d probably also do it on TableA and TableB because the optimizer is unlikely to infer the sub-range for you).
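A sketch of how those pieces might fit together for the July 2009 to June 2011 report. It assumes MonthYearTable has integer Year and Month columns (hypothetical), that date is a DATE column comparable with string literals as in MySQL, and that the earliest matching date is wanted:

```sql
SELECT m.Year, m.Month, a.MinDate, a.MinRate, b.MaxDate, b.MaxRate
  FROM MonthYearTable AS m
  LEFT OUTER JOIN
       (SELECT n.MinYear, n.MinMonth, MIN(t.date) AS MinDate, n.MinRate
          FROM TableA AS t
          JOIN (SELECT YEAR(date) AS MinYear, MONTH(date) AS MinMonth,
                       MIN(rate) AS MinRate
                  FROM TableA
                 WHERE date >= '2009-07-01' AND date < '2011-07-01'
                 GROUP BY YEAR(date), MONTH(date)
               ) AS n
            ON t.rate = n.MinRate
           AND YEAR(t.date)  = n.MinYear
           AND MONTH(t.date) = n.MinMonth
         GROUP BY n.MinYear, n.MinMonth, n.MinRate
       ) AS a
    ON m.Year = a.MinYear AND m.Month = a.MinMonth
  LEFT OUTER JOIN
       (SELECT x.MaxYear, x.MaxMonth, MIN(t.date) AS MaxDate, x.MaxRate
          FROM TableB AS t
          JOIN (SELECT YEAR(date) AS MaxYear, MONTH(date) AS MaxMonth,
                       MAX(rate) AS MaxRate
                  FROM TableB
                 WHERE date >= '2009-07-01' AND date < '2011-07-01'
                 GROUP BY YEAR(date), MONTH(date)
               ) AS x
            ON t.rate = x.MaxRate
           AND YEAR(t.date)  = x.MaxYear
           AND MONTH(t.date) = x.MaxMonth
         GROUP BY x.MaxYear, x.MaxMonth, x.MaxRate
       ) AS b
    ON m.Year = b.MaxYear AND m.Month = b.MaxMonth
 WHERE (m.Year = 2009 AND m.Month >= 7)
    OR  m.Year = 2010
    OR (m.Year = 2011 AND m.Month <= 6)
 ORDER BY m.Year, m.Month;
```

Months with no data in one of the tables still appear, with NULL in the corresponding date and rate columns.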
You could apply more tweaks to the query, especially adding the date range filter in more places. You could consider using an expression such as:
to express the date range in the MonthYearTable. (For this purpose, the DATETIME YEAR TO MONTH type that Informix supports is ideal; the MonthYearTable need only contain a single column containing a value of that type.)
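If MonthYearTable instead holds separate integer Year and Month columns (a hypothetical layout, not something the discussion above specifies), a compact range test can be sketched by folding the two into a single number:

```sql
-- July 2009 through June 2011, inclusive, as yyyymm integers.
SELECT m.Year, m.Month
  FROM MonthYearTable AS m
 WHERE m.Year * 100 + m.Month BETWEEN 200907 AND 201106
 ORDER BY m.Year, m.Month;
```

An expression like this is easy to read but is unlikely to use an index on (Year, Month), which matters little for a table with one row per month.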
And so the story continues…you can play endlessly with the query, but as long as you build it up in pieces and apply the extra criteria systematically, you’ll be able to manage. Doing it ad hoc and trying for a big bang query (and not laying the queries out systematically) will just lead to confusion and disaster.
Analyzing the updated query in the question
Correlated sub-queries in the select-list, albeit in the select-list of a sub-query in the FROM clause of a main query; and LIMIT clauses too. Ouch! I tend to avoid writing a sub-query in the select-list when possible; they hurt my brain even more than the queries in the style I do write. OTOH, carefully handled, they sometimes do the necessary job.
When reformatted in my style, the revised query looks like:
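The layout below is a sketch of such a reformatting of the query from EDIT 1, with the logic unchanged; TableA and TableB stand in for the $TableA and $TableB placeholders used in the question:

```sql
SELECT a.MinYear AS Year, a.MinMonth AS Month,
       a.MinRate, b.MaxRate, a.MinDate, b.MaxDate
  FROM (SELECT YEAR(date) AS MinYear, MONTH(date) AS MinMonth,
               MIN(rate) AS MinRate,
               (SELECT date
                  FROM TableA
                 WHERE rate = MIN(t2.rate)
                   AND YEAR(date)  = YEAR(t2.date)
                   AND MONTH(date) = MONTH(t2.date)
                 LIMIT 1
               ) AS MinDate
          FROM TableA t2
         GROUP BY MinYear, MinMonth
       ) AS a
  JOIN (SELECT YEAR(date) AS MaxYear, MONTH(date) AS MaxMonth,
               MAX(rate) AS MaxRate,
               (SELECT date
                  FROM TableB
                 WHERE rate = MAX(t3.rate)
                   AND YEAR(date)  = YEAR(t3.date)
                   AND MONTH(date) = MONTH(t3.date)
                 LIMIT 1
               ) AS MaxDate
          FROM TableB t3
         GROUP BY MaxYear, MaxMonth
       ) AS b
    ON a.MinYear = b.MaxYear AND a.MinMonth = b.MaxMonth
 ORDER BY Year, Month;
```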
That might work, but I’m not going to pontificate on that. I will say that most DBMS I’m familiar with would probably baulk at the MAX(t3.rate) and MIN(t2.rate) terms. I would not trust the query without experimentation. I also tend not to trust LIMIT 1, doubly not when there’s no ordering criterion. It is at the whim of the DBMS which row is returned if there’s more than one row that the LIMIT could be applied to, and non-deterministic queries are generally a bad idea.
So, while that might work, it is not what I’d ever use, even assuming that my DBMS accepted it. Actually, it’s easier than that for me; the way I think about queries would never come up with that one design, so there is essentially no risk of me formulating the query like that. Whether that’s good or not is a separate discussion.