I’ve been given a stack of data where a particular value has been collected sometimes as a date (YYYY-MM-DD) and sometimes as just a year.
Depending on how you look at it, this is either a variance in type or margin of error.
This is a subprime situation, but I can’t afford to recover or discard any data.
What’s the optimal (i.e. least worst 🙂) SQL table design that will accept either form while avoiding monstrous queries and allowing maximum use of database features like constraints and keys*?
*i.e. Entity-Attribute-Value is out.
+1 each to recommendations from ninesided, Nikki9696 and Jeff Siver – I support all those answers though none was exactly what I decided upon.
My solution:
Advantages:
I would argue that methods using YYYY-01-01 to signify missing data (when flagged as such with a second explanatory column) fail seriously on points 1 and 5.
Example code for SQLite 3:
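As a minimal sketch of one design that fits the constraints in the question (not necessarily the exact schema chosen here), the year can live in its own NOT NULL column with month and day in nullable columns, so a year-only record never needs a placeholder such as YYYY-01-01. The table name event, the column names, and the helper functions are all hypothetical, and the SQL is driven through Python's standard sqlite3 module only to keep the example runnable.

```python
import sqlite3

# Hypothetical schema for illustration; names are not from the original post.
SCHEMA = """
CREATE TABLE event (
    id    INTEGER PRIMARY KEY,
    year  INTEGER NOT NULL,                       -- always known
    month INTEGER CHECK (month BETWEEN 1 AND 12), -- NULL when only the year is known
    day   INTEGER CHECK (day BETWEEN 1 AND 31),   -- NULL when only the year is known
    -- month and day must be both present or both absent
    CHECK ((month IS NULL) = (day IS NULL))
);
CREATE INDEX event_year ON event (year);
"""


def make_db():
    """Create an in-memory SQLite database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_event(conn, year, month=None, day=None):
    """Insert a full date, or a bare year when month and day are omitted."""
    conn.execute(
        "INSERT INTO event (year, month, day) VALUES (?, ?, ?)",
        (year, month, day),
    )


def count_in_year(conn, year):
    """Count rows for a year, whether or not the full date is known."""
    row = conn.execute(
        "SELECT COUNT(*) FROM event WHERE year = ?", (year,)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = make_db()
    add_event(conn, 2001, 5, 17)   # full date
    add_event(conn, 2001)          # year only, no placeholder date stored
    print(count_in_year(conn, 2001))
    try:
        add_event(conn, 2001, 5)   # month without day violates the CHECK
    except sqlite3.IntegrityError as exc:
        print("rejected:", exc)
```

In this sketch a query by year touches a single indexed integer column, and "precision unknown" is expressed by NULL rather than by a fake date. The CHECK constraints only bound month and day to plausible ranges; they do not reject impossible dates such as 30 February, which would need an extra constraint or application-side validation.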