I’m on an optimization kick at the moment.
I tend to use multiple tables, so I don’t have empty columns.
My question is, are empty columns a big deal? I’m not talking about space. I’m referring to speed of indexing, data retrieval, etc.
My best example is when I have a simple customers table, and some columns are not always filled, like email, dob, ssn, or pic. I’d say most of the time they are not filled in.
That causes me to create a new table to house just the ancillary data.
But would it really make a difference if I put these columns in the same table with the rest of the customer info?
If I do this, then there will be many records with empty columns, which makes me wonder how much this affects performance when the record count is huge.
If you’re on an optimization kick, my advice is to get off it 🙂
Optimization is something that should be done in response to a performance problem, not a whim. If there’s no performance problem, all optimization is wasted effort.
Empty fields rarely make a large difference to data retrieval in a properly designed schema, since most queries should, as far as possible, rely on indexes alone to decide which rows to fetch. Only once the rows are identified does the database go to the table for the actual data.
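To make that concrete, here is a minimal sketch using SQLite through Python's built-in sqlite3 module. SQLite is only a stand-in, since the question doesn't say which database is in use, and the table layout, the index name idx_customers_last_name, and the helper plan_for_lookup are all made up for illustration. It builds a customers table where the optional columns are left empty, then asks the query planner how it would find a row by last name.

```python
import sqlite3


def plan_for_lookup(n_rows=1000):
    """Return the query-plan detail strings for a lookup by last_name."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the optional columns live in the main table
    # and are simply left NULL when not filled in.
    conn.execute(
        "CREATE TABLE customers ("
        " id INTEGER PRIMARY KEY,"
        " last_name TEXT NOT NULL,"
        " email TEXT, dob TEXT, ssn TEXT, pic BLOB)"
    )
    conn.execute(
        "CREATE INDEX idx_customers_last_name ON customers (last_name)"
    )
    # Insert rows with every optional column empty.
    conn.executemany(
        "INSERT INTO customers (last_name) VALUES (?)",
        ((f"name{i}",) for i in range(n_rows)),
    )
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM customers WHERE last_name = ?",
        ("name42",),
    ).fetchall()
    conn.close()
    # The human-readable detail is the last column of each plan row.
    return [row[-1] for row in rows]


if __name__ == "__main__":
    for detail in plan_for_lookup():
        print(detail)
```

The plan should report a search using the index on last_name. The empty email, dob, ssn, and pic columns play no part in deciding which rows match; they only come into it when the matching row is read from the table. Other database engines have their own planners and storage formats, so check the plan on your own system.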
And speed of indexing won’t change just because the column is stored in another table. If it needs to be indexed, then it needs to be indexed.
I prefer my schema to be as simple as possible (while still mostly following 3NF) so as to avoid unnecessary joins.
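As a rough illustration of the join trade-off, the sketch below sets up both designs side by side, again with SQLite via Python's sqlite3 as an assumed stand-in. The table names customers_wide, customers, and customer_extras are hypothetical. Both queries return the same rows, but the split design needs a LEFT JOIN every time the optional data is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Design A: one table, optional columns left NULL.
    CREATE TABLE customers_wide (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        email TEXT,
        dob TEXT
    );

    -- Design B: optional columns moved to a side table.
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE customer_extras (
        customer_id INTEGER PRIMARY KEY REFERENCES customers(id),
        email TEXT,
        dob TEXT
    );

    INSERT INTO customers_wide (id, name, email)
        VALUES (1, 'Ann', 'ann@example.com'), (2, 'Bob', NULL);
    INSERT INTO customers (id, name) VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO customer_extras (customer_id, email)
        VALUES (1, 'ann@example.com');
    """
)

# Design A: a plain single-table query.
wide = conn.execute(
    "SELECT name, email FROM customers_wide ORDER BY id"
).fetchall()

# Design B: the same result requires a LEFT JOIN so that customers
# without a row in customer_extras still appear.
split = conn.execute(
    "SELECT c.name, e.email "
    "FROM customers c "
    "LEFT JOIN customer_extras e ON e.customer_id = c.id "
    "ORDER BY c.id"
).fetchall()

print(wide)
print(split)
```

Either layout gives Bob an empty email; the difference is only in how much query you have to write, and how much work the database does, to get there.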