This is essentially a question about constructing an SQL query. The db is implemented with sqlite3. I am a relatively new user of SQL.
I have two tables and want to join them in an unusual way. The following is an example to explain the problem.
Table 1 (t1):
id   year  name
-------------------------
297  2010  Charles
298  2011  David
300  2010  Peter
301  2011  Richard
Table 2 (t2):
id   year  food
---------------------------
296  2009  Bananas
296  2011  Bananas
297  2009  Melon
297  2010  Coffee
297  2012  Cheese
298  2007  Sugar
298  2008  Cereal
298  2012  Chocolate
299  2000  Peas
300  2007  Barley
300  2011  Beans
300  2012  Chickpeas
301  2010  Watermelon
I want to join the tables on id and year. The catch is that (1) id must match exactly, and (2) if there is no exact match in Table 2 for the year in Table 1, then I want to choose the nearest earlier year that is available. A selection of the kind that I want to produce would give the following result:
id   year  matchyr  name     food
-------------------------------------------------
297  2010  2010     Charles  Coffee
298  2011  2008     David    Cereal
300  2010  2007     Peter    Barley
301  2011  2010     Richard  Watermelon
To summarise, id=297 had an exact match for year=2010 given in Table 1, so the corresponding line for id=297, year=2010 is chosen from Table 2. id=298, year=2011 did not have a matching year in Table 2, so the nearest available earlier year (2008) is chosen. As you can see, I would also like to know what that matched year (whether exact or inexact) actually was.
I would very much appreciate (1) an indication (yes/no answer) of whether this is possible to do in SQL alone, or whether I need to look outside SQL, and (2) a solution, if that is not too onerous.
Sure, you can do this in SQL. The first step is a query which joins t1 and t2 on id and gives you, for each id, every candidate year in t2 that is no later than the year in t1 (EDIT: fixed this sentence, the original one was incorporating the next step into it as well). You can create a view called t3 which gives you this as follows:
CREATE VIEW [t3] AS
SELECT t1.id AS id, t2.year AS year FROM t1 INNER JOIN t2 ON (t1.id = t2.id AND t2.year <= t1.year)
From here, you’ll want to get the maximum year from t3 per id:
CREATE VIEW [t4] AS
SELECT id, MAX(year) AS matchyr FROM t3 GROUP BY id
You should be able to join t4 back onto t1 (on id) and onto t2 (on id and the maximum year) to get the name, the food and the matched year. Of course, there are ways to shorten this, but it should get you thinking along the right lines. Also, I’m guessing it’s just an example, but please don’t call your tables t1, t2, etc.
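To make the last step concrete, here is a minimal sketch that loads the sample data from the question into an in-memory sqlite3 database, creates the two views, and joins t4 back onto t1 and t2. It is wrapped in Python's standard sqlite3 module only so it can be run as-is; the SQL is the part that matters. The matchyr alias is borrowed from the desired output in the question. It also assumes, as in the example data, that each id appears only once in t1; if an id could appear with several years, the views would need to carry t1.year along and group by it too.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER, year INTEGER, name TEXT);
CREATE TABLE t2 (id INTEGER, year INTEGER, food TEXT);

INSERT INTO t1 VALUES
  (297, 2010, 'Charles'), (298, 2011, 'David'),
  (300, 2010, 'Peter'),   (301, 2011, 'Richard');

INSERT INTO t2 VALUES
  (296, 2009, 'Bananas'), (296, 2011, 'Bananas'),
  (297, 2009, 'Melon'),   (297, 2010, 'Coffee'), (297, 2012, 'Cheese'),
  (298, 2007, 'Sugar'),   (298, 2008, 'Cereal'), (298, 2012, 'Chocolate'),
  (299, 2000, 'Peas'),
  (300, 2007, 'Barley'),  (300, 2011, 'Beans'),  (300, 2012, 'Chickpeas'),
  (301, 2010, 'Watermelon');

-- Step 1: every t2 year that is no later than the t1 year, per id.
CREATE VIEW t3 AS
SELECT t1.id AS id, t2.year AS year
FROM t1 INNER JOIN t2 ON (t1.id = t2.id AND t2.year <= t1.year);

-- Step 2: the latest of those candidate years, per id.
-- Assumes each id occurs once in t1 (true for the sample data).
CREATE VIEW t4 AS
SELECT id, MAX(year) AS matchyr FROM t3 GROUP BY id;
""")

# Step 3: join t4 back onto t1 (for the name) and t2 (for the food).
rows = conn.execute("""
SELECT t1.id, t1.year, t4.matchyr, t1.name, t2.food
FROM t1
JOIN t4 ON t4.id = t1.id
JOIN t2 ON t2.id = t4.id AND t2.year = t4.matchyr
ORDER BY t1.id
""").fetchall()

for row in rows:
    print(*row)
```

With the sample data this gives the four rows asked for in the question (Coffee, Cereal, Barley, Watermelon). An id in t1 with no earlier-or-equal year in t2 would simply drop out of the result, because every join here is an inner join.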