The Archive Base

Editorial Team
Asked: June 7, 2026

I’m about to create a data warehouse with facts and dimensions in a star-schema.

The business questions I want to answer are typically these:

  • How much money did we sell for in Q1?
  • How much money did we sell for in Q1 to females?
  • How much money did we sell for in Q1 to females between age 30-35?
  • How much money did we sell for in Q1 to females between age 30-35 living in New York?

  • How much money did we sell for in category clothes last year?

  • How much money did we sell for of the product blue jeans last year?
  • How much money did we sell for of the product blue jeans to males between 40-42 living in Australia last year?

I am thinking of a date dimension with the granularity of an hour (specifying year, month, day, hour, quarter, name of day, name of month etc.)
I am also thinking of a product dimension and a user dimension.

I wonder whether these questions could be answered using a single fact table, or whether it is proper to create multiple fact tables. I am thinking of a table such as:

FactSales

DimDate – FK to a table containing information about the date (such as quarter, day of week, year, month, day)

DimProduct – FK to a table containing information about the product (such as product name)

DimUser – FK to a table containing information about the user (such as age, gender)

TotalSales – a SUM of all sales for that particular date, product and user.

Also, what if I would like to measure both the total sales (money) and the total number of sales? Would it be proper to create a new fact table with the same dimensions but using TotalNumberOfSales as the fact instead?

Thankful for all input I can get about this.

1 Answer

  1. Editorial Team
    Added an answer on June 7, 2026 at 7:12 pm

    I think you are on the right track. All the questions above should be answerable using a single fact table covering the sales.
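To make that concrete, here is a minimal sketch of a single sales fact table joined to date, user and product dimensions. It uses Python's built-in sqlite3 only so that it runs anywhere; the column names (`date_key`, `city`, `category`, `amount`) and the sample rows are assumptions for illustration, not something taken from the question.

```python
import sqlite3

# In-memory database; all table/column names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate (date_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE DimUser (user_key INTEGER PRIMARY KEY, gender TEXT, age INTEGER, city TEXT);
CREATE TABLE DimProduct (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE FactSales (
    date_key INTEGER REFERENCES DimDate(date_key),
    user_key INTEGER REFERENCES DimUser(user_key),
    product_key INTEGER REFERENCES DimProduct(product_key),
    amount REAL
);
INSERT INTO DimDate VALUES (20260115, 2026, 1), (20260520, 2026, 2);
INSERT INTO DimUser VALUES
    (1, 'F', 32, 'New York'), (2, 'M', 41, 'Sydney'), (3, 'F', 50, 'New York');
INSERT INTO DimProduct VALUES (1, 'blue jeans', 'clothes'), (2, 'kettle', 'kitchen');
INSERT INTO FactSales VALUES
    (20260115, 1, 1, 80.0),
    (20260115, 1, 2, 25.0),
    (20260115, 2, 1, 80.0),
    (20260115, 3, 1, 80.0),
    (20260520, 1, 1, 80.0);
""")

def total_sales(where="1=1", params=()):
    """Sum the fact amounts after joining all three dimensions.

    `where` is a trusted SQL fragment written by the developer (never user
    input); values that vary go through `params` placeholders.
    """
    sql = f"""
        SELECT COALESCE(SUM(f.amount), 0)
        FROM FactSales f
        JOIN DimDate d ON d.date_key = f.date_key
        JOIN DimUser u ON u.user_key = f.user_key
        JOIN DimProduct p ON p.product_key = f.product_key
        WHERE {where}
    """
    return conn.execute(sql, params).fetchone()[0]

# Each business question only changes the filter, not the fact table.
q1_total = total_sales("d.year = ? AND d.quarter = 1", (2026,))
q1_females_ny = total_sales(
    "d.year = ? AND d.quarter = 1 AND u.gender = 'F' "
    "AND u.age BETWEEN 30 AND 35 AND u.city = ?",
    (2026, "New York"),
)
clothes_2026 = total_sales("d.year = ? AND p.category = 'clothes'", (2026,))
print(q1_total, q1_females_ny, clothes_2026)
```

Every question in the list narrows the same join with more dimension filters, which is why one fact table is enough here.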

    I think one should start out unaggregated and aggregate later only if needed. Since one sale can contain multiple products and multiple items, I’d organize it as one fact row for each product in the sale (typically the lines on the invoice, so I’d call them “order lines” or “sale lines”), with perhaps three counter attributes:

    • NumItems – number of items, i.e. 3 if the customer bought three of the same product.
    • NumLines – number of “order lines”; should always be 1. May be useful when aggregating data later (it is a big win to already have sum(NumLines) rather than count(*) in the SQL), or when adding correction items (NumLines = -1).
    • NumSales – a fractional number that can be summed up to yield the number of sales (e.g. 0.333 on each line if the sale involves three different products and hence contains three order lines).
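The three counters can be sketched as follows. This is only an illustration under assumed names (`FactSaleLines`, `sale_id`, `amount`) and made-up sales, again using sqlite3 so that it is runnable; each sale is expanded into one row per order line, and `num_sales` is set to 1 divided by the number of lines in that sale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE FactSaleLines (
    sale_id   INTEGER,
    product   TEXT,
    num_items INTEGER,  -- items bought on this line
    num_lines INTEGER,  -- always 1 (or -1 for a correction line)
    num_sales REAL,     -- fraction of the sale this line represents
    amount    REAL
)
""")

# Hypothetical source data: sale_id -> list of (product, items, amount) lines.
sales = {
    1: [("blue jeans", 3, 120.0), ("black shirt", 1, 30.0)],
    2: [("blue jeans", 1, 40.0)],
}

rows = []
for sale_id, lines in sales.items():
    for product, num_items, amount in lines:
        # Each line carries an equal share of its sale, so shares sum to 1 per sale.
        rows.append((sale_id, product, num_items, 1, 1.0 / len(lines), amount))
conn.executemany("INSERT INTO FactSaleLines VALUES (?, ?, ?, ?, ?, ?)", rows)

# All four totals come from plain SUMs over the unaggregated lines.
num_items, num_lines, num_sales, turnover = conn.execute(
    "SELECT SUM(num_items), SUM(num_lines), SUM(num_sales), SUM(amount) "
    "FROM FactSaleLines"
).fetchone()
print(num_items, num_lines, num_sales, turnover)
```

Because the counters are additive, the same sums still work after the lines have been rolled up into an aggregate table.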

    Now, it becomes a problem to get the right count for a filtered question such as “number of sales involving black clothes”, because the fractional NumSales values no longer add up to whole sales once some lines are filtered out. We had this problem at my previous workplace. I’m sure there must exist some “best practice” for this, but we ended up more or less introducing a SaleID (or TransactionID) in the fact table and doing count(distinct SaleID). That lacks elegance, but it works.
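A small sketch of that counting problem, with assumed names (`FactSaleLines`, `color`) and invented rows: sale 1 has three lines, two of them black, and sale 2 has a single black line. Summing the fractions for black items undercounts, while `COUNT(DISTINCT sale_id)` gives the number of sales.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FactSaleLines "
    "(sale_id INTEGER, product TEXT, color TEXT, num_sales REAL)"
)
conn.executemany(
    "INSERT INTO FactSaleLines VALUES (?, ?, ?, ?)",
    [
        (1, "shirt", "black", 1 / 3),
        (1, "trousers", "black", 1 / 3),
        (1, "jeans", "blue", 1 / 3),
        (2, "shirt", "black", 1.0),
    ],
)

# Filtering on a line-level attribute breaks the fractional counter,
# but counting distinct sale ids still gives whole sales.
fractional, distinct_sales = conn.execute(
    "SELECT SUM(num_sales), COUNT(DISTINCT sale_id) "
    "FROM FactSaleLines WHERE color = 'black'"
).fetchone()
print(round(fractional, 3), distinct_sales)
```

Two sales involve black clothes, yet the fractional sum comes out at roughly 1.667, which is the mismatch described above.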

    In our setup we had several money attributes. The most important were one for the revenue (what’s left of the income after paying the direct costs attributed to the items sold) and one for the turnover (the price paid by the customer for the item). Sales tax or VAT may add more complications. One can make do with only one money attribute and then split each sale into multiple lines in the fact table, but I would rather recommend multiple money columns in the sales line fact table. Everything in the fact table was counted in a “base currency” (Euros, in our case), and we had an exchange rate dimension to track the exact amounts.
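One possible shape for this, as a sketch only: two money columns stored in the base currency, plus a key into an exchange rate dimension. The names (`DimExchangeRate`, `per_eur`, `turnover_eur`, `revenue_eur`) and the rates are assumptions, and `per_eur` is taken here to mean units of the customer's currency per one Euro.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimExchangeRate (
    rate_key INTEGER PRIMARY KEY,
    currency TEXT,
    per_eur  REAL   -- assumed meaning: units of `currency` per 1 EUR
);
CREATE TABLE FactSaleLines (
    sale_id      INTEGER,
    rate_key     INTEGER REFERENCES DimExchangeRate(rate_key),
    turnover_eur REAL,  -- price paid by the customer, in base currency
    revenue_eur  REAL   -- what is left after direct costs, in base currency
);
INSERT INTO DimExchangeRate VALUES (1, 'EUR', 1.0), (2, 'USD', 1.25);
INSERT INTO FactSaleLines VALUES (1, 1, 100.0, 40.0), (2, 2, 80.0, 20.0);
""")

# Base-currency columns can be summed directly across all sales.
turnover_eur, revenue_eur = conn.execute(
    "SELECT SUM(turnover_eur), SUM(revenue_eur) FROM FactSaleLines"
).fetchone()

# The exchange rate dimension recovers the amounts in the customer's currency.
paid_by_currency = conn.execute("""
    SELECT r.currency, SUM(f.turnover_eur * r.per_eur)
    FROM FactSaleLines f
    JOIN DimExchangeRate r ON r.rate_key = f.rate_key
    GROUP BY r.currency
    ORDER BY r.currency
""").fetchall()
print(turnover_eur, revenue_eur, paid_by_currency)
```

Keeping both measures on the same row means turnover and revenue always aggregate over exactly the same lines.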

    I don’t think it makes sense to have a date dimension containing the hour of the day. At my former job I kept my warehouse in postgres, and I actually managed quite well without a date dimension at all. Although a date dimension is considered “best business practice”, I found that for all our purposes we got much better performance by using the standard postgres date functions instead of dragging in a date dimension. I experimented quite a lot with it, and in the end I found that the best option was to split date and time into two different attributes. (Timezones and daylight saving gave me quite some extra headaches…)
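As a rough sketch of the date and time split without a date dimension: the fact table below stores `sale_date` and `sale_time` as two separate attributes (both names are assumptions), and the quarter and hour are derived with date functions at query time. The example uses SQLite's `strftime` so that it runs standalone; in postgres the equivalent would be something like `extract(quarter from sale_date)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FactSaleLines (
    sale_date TEXT,  -- ISO date, e.g. '2026-01-15'
    sale_time TEXT,  -- time of day, e.g. '09:30:00'
    amount    REAL
);
INSERT INTO FactSaleLines VALUES
    ('2026-01-15', '09:30:00', 80.0),
    ('2026-03-31', '23:59:00', 20.0),
    ('2026-04-01', '00:05:00', 50.0);
""")

# Quarter derived from the month: months 1-3 -> 1, 4-6 -> 2, and so on
# (SQLite divides integers with truncation).
by_quarter = conn.execute("""
    SELECT CAST(strftime('%Y', sale_date) AS INTEGER) AS yr,
           (CAST(strftime('%m', sale_date) AS INTEGER) + 2) / 3 AS qtr,
           SUM(amount)
    FROM FactSaleLines
    GROUP BY yr, qtr
    ORDER BY yr, qtr
""").fetchall()

# Hour-of-day analysis uses the separate time attribute only.
by_hour = conn.execute("""
    SELECT CAST(strftime('%H', sale_time) AS INTEGER) AS hr, SUM(amount)
    FROM FactSaleLines
    GROUP BY hr
    ORDER BY hr
""").fetchall()
print(by_quarter, by_hour)
```

Timezone and daylight saving handling is left out of this sketch; the stored values are assumed to already be in one consistent local time.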
