I’m trying to develop a suitable BI solution whereby the dimension and fact tables have a 1:1 relationship.
Take, for example, the following:
Fact_UserData
- User ID
- Location ID
- Occupation ID
- a bunch of numeric data that can be meaningfully aggregated
Dim_User
- User ID
- Gender
- Ethnicity
Dim_Location
- Location ID
- District
- City
- State
Dim_Occupation
- Occupation ID
- Occupation Name
In this example, assume that Fact_UserData and Dim_User will always have a 1:1 relationship via User ID.
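For concreteness, here is a minimal sketch of that layout using Python's built-in sqlite3 module. The snake_case table names, the `amount` measure, and the sample rows are made up for illustration; the real fact table would carry its own numeric columns. The UNIQUE constraint on `user_id` is one possible way to express the 1:1 assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_user (user_id INTEGER PRIMARY KEY, gender TEXT, ethnicity TEXT);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY,
                           district TEXT, city TEXT, state TEXT);
CREATE TABLE dim_occupation (occupation_id INTEGER PRIMARY KEY, occupation_name TEXT);
CREATE TABLE fact_user_data (
    user_id INTEGER UNIQUE REFERENCES dim_user(user_id),  -- UNIQUE: at most one fact row per user
    location_id INTEGER REFERENCES dim_location(location_id),
    occupation_id INTEGER REFERENCES dim_occupation(occupation_id),
    amount REAL                                           -- hypothetical additive measure
);
""")

# Illustrative sample rows only.
conn.executemany("INSERT INTO dim_user VALUES (?, ?, ?)",
                 [(1, "F", "A"), (2, "M", "B"), (3, "F", "B")])
conn.executemany("INSERT INTO dim_location VALUES (?, ?, ?, ?)",
                 [(10, "D1", "Springfield", "IL")])
conn.executemany("INSERT INTO dim_occupation VALUES (?, ?)",
                 [(100, "Engineer"), (200, "Teacher")])
conn.executemany("INSERT INTO fact_user_data VALUES (?, ?, ?, ?)",
                 [(1, 10, 100, 50.0), (2, 10, 200, 30.0), (3, 10, 100, 20.0)])

# Grouping by occupation, the query the business cares about most.
by_occupation = conn.execute("""
    SELECT o.occupation_name, SUM(f.amount)
    FROM fact_user_data f
    JOIN dim_occupation o ON o.occupation_id = f.occupation_id
    GROUP BY o.occupation_name
    ORDER BY o.occupation_name
""").fetchall()

# Grouping by a user attribute goes through the 1:1 user dimension.
by_gender = conn.execute("""
    SELECT u.gender, SUM(f.amount)
    FROM fact_user_data f
    JOIN dim_user u ON u.user_id = f.user_id
    GROUP BY u.gender
    ORDER BY u.gender
""").fetchall()
```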
What’s primarily throwing me off is the 1:1 relationship – should I have a dedicated User dimension or should I merge those attributes into the fact table? I’m hesitant to merge since, according to Kimball, degenerate dimensions should be reserved for operational control numbers. I’m also wondering whether it makes sense to have occupation as a dedicated dimension – grouping by occupation is crucial from the perspective of the business, which is why I initially have it as its own dimension.
As a generalization of the occupation dimension question, what would be the best-practice approach to handling situations where dimensions only have two fields: ID and name? (Think of it like a typical Customers dimension, except that it only has the fields Customer ID and Customer Name.) Assume that the dimension has 10+ entries and doesn’t have any hierarchies.
Well, considering that a person can change occupation and location, I would expect a DateKey in that fact table too. If you pull occupation and/or location into the user dimension, you’ll end up with a type-2 dimension, so you will have to track temporal changes there.
There is nothing wrong with having a dimension with just Key, BusinessKey; things change over time, and you will eventually add something.
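To illustrate what tracking temporal changes in a type-2 dimension could look like, here is a rough sketch using Python's sqlite3 module. The column names (`user_key`, `valid_from`, `valid_to`), the `amount` measure, and the sample rows are all assumptions made for the example, not a prescribed layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Type-2 user dimension: one row per version of a user (hypothetical layout).
CREATE TABLE dim_user (
    user_key INTEGER PRIMARY KEY,  -- surrogate key, one per version
    user_id INTEGER,               -- business key, repeated across versions
    occupation_name TEXT,
    valid_from TEXT,               -- ISO date, inclusive
    valid_to TEXT                  -- ISO date, exclusive; '9999-12-31' marks the current row
);
CREATE TABLE fact_user_data (
    date_key TEXT,
    user_key INTEGER REFERENCES dim_user(user_key),
    amount REAL                    -- hypothetical additive measure
);
""")

# User 42 changed occupation on 2021-06-01, so the dimension holds two versions.
conn.executemany("INSERT INTO dim_user VALUES (?, ?, ?, ?, ?)", [
    (1, 42, "Teacher", "2020-01-01", "2021-06-01"),
    (2, 42, "Engineer", "2021-06-01", "9999-12-31"),
])
# Each fact row points at the version that was valid on its date.
conn.executemany("INSERT INTO fact_user_data VALUES (?, ?, ?)", [
    ("2020-03-01", 1, 10.0),
    ("2021-07-01", 2, 25.0),
    ("2021-08-01", 2, 5.0),
])

# Totals are attributed to the occupation held at the time of each fact.
totals = conn.execute("""
    SELECT d.occupation_name, SUM(f.amount)
    FROM fact_user_data f
    JOIN dim_user d ON d.user_key = f.user_key
    GROUP BY d.occupation_name
    ORDER BY d.occupation_name
""").fetchall()

# "As of" lookup: which occupation did user 42 hold on a given date?
as_of = "2020-12-31"
occupation_then = conn.execute(
    "SELECT occupation_name FROM dim_user "
    "WHERE user_id = ? AND valid_from <= ? AND ? < valid_to",
    (42, as_of, as_of),
).fetchone()[0]
```

If occupation and location stay in their own dimensions instead, the same history is carried by the fact rows themselves: each row stores the DateKey plus the occupation and location keys that applied on that date.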