I have a table that contains data for every day in 2002, but some dates are missing: there are 354 records for 2002 instead of 365. For my calculations, I need the missing dates in the table with NULL values.
+-----+------------+------------+
| ID  | rainfall   | date       |
+-----+------------+------------+
| 100 | 110.2      | 2002-05-06 |
| 101 | 56.6       | 2002-05-07 |
| 102 | 65.6       | 2002-05-09 |
| 103 | 75.9       | 2002-05-10 |
+-----+------------+------------+
You can see that 2002-05-08 is missing. I want my final table to look like this:
+-----+------------+------------+
| ID  | rainfall   | date       |
+-----+------------+------------+
| 100 | 110.2      | 2002-05-06 |
| 101 | 56.6       | 2002-05-07 |
| 102 |            | 2002-05-08 |
| 103 | 65.6       | 2002-05-09 |
| 104 | 75.9       | 2002-05-10 |
+-----+------------+------------+
Is there a way to do that in PostgreSQL?
It doesn’t matter if I get this only as a query result (not necessarily an updated table).
`date` is a reserved word in standard SQL and the name of a data type in PostgreSQL. PostgreSQL allows it as an identifier, but that doesn’t make it a good idea. I use `thedate` as the column name instead.

Don’t rely on the absence of gaps in a surrogate ID. That’s almost always a bad idea. Treat such an ID as a unique number without meaning, even if it seems to carry certain other attributes most of the time.

In this particular case, as @Clodoaldo commented, `thedate` seems to be a perfect primary key and the column `id` is just cruft, so I removed it.

Query
Full table by query:
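A minimal sketch of such a query, not necessarily the exact statement originally posted. The table name `tbl`, the column types, and the alias `d` are assumptions made for illustration; the sample rows are the ones from the question, without the `id` column:

```sql
-- Assumed setup: a temp table holding the sample rows from the question.
CREATE TEMP TABLE tbl (
   thedate  date PRIMARY KEY
 , rainfall numeric
);

INSERT INTO tbl (thedate, rainfall) VALUES
   ('2002-05-06', 110.2)
 , ('2002-05-07',  56.6)
 , ('2002-05-09',  65.6)
 , ('2002-05-10',  75.9);

-- Generate one row per day between the first and last date in tbl,
-- then LEFT JOIN the real data so missing days show rainfall = NULL.
-- The casts to timestamp keep the series independent of the session time zone.
SELECT *
FROM  (
   SELECT generate_series(min(thedate)::timestamp
                        , max(thedate)::timestamp
                        , interval '1 day')::date AS thedate
   FROM   tbl
   ) d
LEFT   JOIN tbl USING (thedate)
ORDER  BY thedate;
```

With the sample rows this returns five rows, and the row for 2002-05-08 has a NULL `rainfall`.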
Similar to what @a_horse_with_no_name posted, but simplified and ignoring the pruned `id`.

Fills in gaps between the first and last date found in the table. If there can be leading or trailing gaps, extend the range accordingly. You can use `date_trunc()` like @Clodoaldo demonstrated, but his query suffers from syntax errors and can be simpler.

INSERT missing rows
The fastest and most readable way to do it is a `NOT EXISTS` anti-semi-join.
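A sketch of what such an `INSERT` could look like, under the assumption that the data lives in a table `tbl (thedate date PRIMARY KEY, rainfall numeric)`; the table name, the column types, and the aliases `d` and `t` are illustrative and not taken from the original post:

```sql
-- Assumes: tbl (thedate date PRIMARY KEY, rainfall numeric), no default on rainfall.
-- Inserts one row for every day between the first and last existing date
-- that has no row yet. rainfall is not listed, so it is NULL for the new rows.
INSERT INTO tbl (thedate)
SELECT d.thedate
FROM  (
   SELECT generate_series(min(thedate)::timestamp
                        , max(thedate)::timestamp
                        , interval '1 day')::date AS thedate
   FROM   tbl
   ) d
WHERE  NOT EXISTS (
   SELECT 1
   FROM   tbl t
   WHERE  t.thedate = d.thedate   -- anti-join: keep only dates not yet present
   );
```

To cover all of 2002 rather than only the span between the first and last existing date, the two bounds could be replaced with `timestamp '2002-01-01'` and `timestamp '2002-12-31'`.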