I want to store a user’s gender in a database with as little (size/performance) cost as possible.
So far, three options come to mind:
- Int – aligned with Enum in code (1 = Male, 2 = Female, 3 = …)
- char(1) – Store m, f or another single character identifier
- Bit (boolean) – is there an appropriate field name for this option?
I ask because of this answer, which mentions that chars are smaller than booleans.
I should clarify that I’m using MS SQL 2008, which DOES in fact have the bit datatype.
I’d call the column “gender”.
The BIT data type can be ruled out because it only supports two values (plus NULL, if the column is nullable), which is inadequate for gender. INT supports more than two options, but it takes 4 bytes; performance will be better with a narrower data type.
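As a rough check of the sizes involved, the query below asks SQL Server for the byte length of one value of each candidate type. It is only a sketch: the column aliases are made up for illustration, and `DATALENGTH` reports the size of a single value, not the on-disk row size.

```sql
-- Byte length of a single value of each candidate type.
SELECT
    DATALENGTH(CAST(1 AS INT))       AS IntBytes,      -- 4
    DATALENGTH(CAST(1 AS TINYINT))   AS TinyIntBytes,  -- 1
    DATALENGTH(CAST('m' AS CHAR(1))) AS CharBytes,     -- 1
    DATALENGTH(CAST(1 AS BIT))       AS BitBytes;      -- 1
```

Note that a lone BIT column still occupies a full byte in the row; SQL Server only packs BIT columns together when a table has several of them (up to eight per byte).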
CHAR(1) has the edge over TINYINT: both take the same number of bytes, but CHAR provides a narrower set of values. Using CHAR(1) would make “m”, “f”, etc. natural keys, whereas numeric values are surrogate/artificial keys. CHAR(1) is also supported on any database, should there be a need to port.
Conclusion
I would use Option 2: CHAR(1).
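A minimal sketch of what that could look like in T-SQL. The table name, the constraint names, and the 'o' code are assumptions for illustration, not something from the question; adjust the allowed codes to whatever your application defines.

```sql
-- Hypothetical table: names and allowed codes are illustrative only.
CREATE TABLE dbo.Users (
    UserId INT IDENTITY(1, 1) NOT NULL
        CONSTRAINT PK_Users PRIMARY KEY,
    -- One byte per row; NULL means "not specified".
    Gender CHAR(1) NULL
        CONSTRAINT CK_Users_Gender CHECK (Gender IN ('m', 'f', 'o'))
);

INSERT INTO dbo.Users (Gender) VALUES ('f');   -- accepted
INSERT INTO dbo.Users (Gender) VALUES (NULL);  -- accepted: NULL passes a CHECK constraint
-- INSERT INTO dbo.Users (Gender) VALUES ('x'); -- would be rejected by CK_Users_Gender
```

The CHECK constraint is optional, but it keeps stray characters out of the column. With a case-insensitive collation, which is a common default, 'M' and 'F' would also be accepted.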
Addendum
An index on the gender column is unlikely to help, because an index on a low-cardinality column has little value: there is not enough variety in the values for the index to be selective.