The Archive Base


The Archive Base Latest Questions

Editorial Team
Asked: May 26, 2026


I know variations of this question have been asked before, but my case may be a little different 🙂

So, I am building a site that tracks events. Each event has an id and a value. It is also performed by a user, who has an id, age, gender, city, country and rank (these attributes are all integers, if it matters).

I need to be able to quickly get answers to two queries:

  • get the number of events from users with a certain profile (for example, males aged 18-25 from Moscow, Russia)
  • get the sum (and maybe also the average) of the values of events from users with a certain profile

Also, data is generated by multiple customers, each of which can have multiple source_ids.

Access pattern: data will mostly be written by collector processes, but when it is queried (infrequently, by the web UI), it has to respond quickly.

I expect LOTS of data, certainly more than a single table or a single server can handle.

I am thinking about grouping events into separate tables per day (that is, ‘events_20111011’). I also want to prefix the table name with the customer id and source id, so that data is isolated, can be trivially discarded (to purge old data), and can be moved around relatively easily (to distribute load to other machines).
This way, every such table will have a limited number of rows, let’s say 10M tops.
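
To make the per-day table idea concrete, here is a minimal sketch using Python's built-in sqlite3 module purely for illustration. The `c<customer>_s<source>_events_<date>` name format, the column set, and the helper names are assumptions made for this example, not part of any existing system.

```python
import sqlite3
from datetime import date

def event_table_name(customer_id, source_id, day):
    """Return a name such as 'c42_s7_events_20111011' (prefix format is an assumption)."""
    return f"c{int(customer_id)}_s{int(source_id)}_events_{day:%Y%m%d}"

def create_event_table(conn, name):
    # Table names cannot be bound as SQL parameters, so the name is built
    # only from integers and a date, and then quoted.
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{name}" ('
        "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, value INTEGER NOT NULL)"
    )

def purge_before(conn, customer_id, source_id, cutoff):
    """Drop this customer/source's day tables dated strictly before cutoff."""
    prefix = f"c{int(customer_id)}_s{int(source_id)}_events_"
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    dropped = []
    for name in names:
        stamp = name[len(prefix):]
        if name.startswith(prefix) and stamp.isdigit() and stamp < f"{cutoff:%Y%m%d}":
            conn.execute(f'DROP TABLE "{name}"')
            dropped.append(name)
    return sorted(dropped)

conn = sqlite3.connect(":memory:")
for d in (date(2011, 10, 9), date(2011, 10, 10), date(2011, 10, 11)):
    create_event_table(conn, event_table_name(42, 7, d))
print(purge_before(conn, 42, 7, date(2011, 10, 11)))
# prints ['c42_s7_events_20111009', 'c42_s7_events_20111010']
```

Purging old data then amounts to dropping whole tables rather than deleting rows, which is the main appeal of this layout.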

So, the question is: what should I do with the users’ attributes?

Option 1, normalized: store them in separate table and reference from event tables.

  • (pro) No repetition of data.
  • (con) Joins, which are expensive (or so I heard).
  • (con) This requires the user table and the event tables to be on the same server.
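
A rough sketch of what Option 1 could look like, again using Python's sqlite3 only for illustration. The integer codes (gender 1 for male, city 495 and country 7 for Moscow, Russia), the index names, and the sample rows are all made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY, age INTEGER, gender INTEGER,
    city INTEGER, country INTEGER, rank INTEGER
);
CREATE TABLE events_20111011 (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    value INTEGER NOT NULL
);
-- Equality columns first, the range column (age) last.
CREATE INDEX users_profile ON users (country, city, gender, age);
CREATE INDEX events_20111011_user ON events_20111011 (user_id);
""")
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 20, 1, 495, 7, 3),   # male, 20, Moscow: matches the profile
    (2, 30, 1, 495, 7, 5),   # male, 30: too old
    (3, 22, 2, 495, 7, 1),   # different gender code
])
conn.executemany(
    "INSERT INTO events_20111011 (user_id, value) VALUES (?, ?)",
    [(1, 10), (1, 5), (2, 7), (3, 4)],
)

# Count and sum of events for males aged 18-25 from Moscow, Russia.
count, total = conn.execute("""
    SELECT COUNT(*), COALESCE(SUM(e.value), 0)
    FROM events_20111011 AS e
    JOIN users AS u ON u.id = e.user_id
    WHERE u.country = ? AND u.city = ? AND u.gender = ?
      AND u.age BETWEEN ? AND ?
""", (7, 495, 1, 18, 25)).fetchone()
print(count, total)  # prints 2 15
```

The join only works here because both tables live in the same database, which is exactly the second (con) above.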

Option 2, redundant: store user attributes in event tables and index them.

  • (pro) easier load balancing (self-contained tables can be moved around)
  • (pro) simpler (faster?) queries
  • (con) lots of disk space and memory used for repeating user attributes and corresponding indexes
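
For comparison, a rough sketch of Option 2 under the same illustrative assumptions (Python's sqlite3, made-up integer codes, sample rows and index name). The user attributes are copied into every event row, so the profile query needs no join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events_20111011 (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    value INTEGER NOT NULL,
    -- user attributes repeated on every event row
    age INTEGER, gender INTEGER, city INTEGER, country INTEGER, rank INTEGER
);
-- Equality columns first, the range column (age) last.
CREATE INDEX events_20111011_profile
    ON events_20111011 (country, city, gender, age);
""")
conn.executemany(
    "INSERT INTO events_20111011 "
    "(user_id, value, age, gender, city, country, rank) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 10, 20, 1, 495, 7, 3),
        (1, 5, 20, 1, 495, 7, 3),   # same user, attributes stored again
        (2, 7, 30, 1, 495, 7, 5),
        (3, 4, 22, 2, 495, 7, 1),
    ],
)

# Count, sum and average for males aged 18-25 from Moscow, Russia.
count, total, average = conn.execute("""
    SELECT COUNT(*), COALESCE(SUM(value), 0), AVG(value)
    FROM events_20111011
    WHERE country = ? AND city = ? AND gender = ?
      AND age BETWEEN ? AND ?
""", (7, 495, 1, 18, 25)).fetchone()
print(count, total, average)  # prints 2 15 7.5
```

The table is self-contained and can be moved to another machine on its own, at the cost of five extra integer columns and an index per event table.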

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 2:26 am

    Your design should be normalized; your physical schema may end up denormalized for performance reasons.

    Is it possible to do both? There is a reason why SQL Server ships with Analysis Services. Even if you are not in the Microsoft realm, it is a common design to have a transactional system for data entry and day-to-day processing, while a separate reporting system handles the kinds of queries that would put a heavy load on the transactional system.

    Doing this means you get the best of both worlds: a normalized system for daily operations and a denormalized system for rollup queries.

    In most cases nightly updates are fine for reporting systems, but what works best depends on your hours of operation and other factors. I find most 8-5 businesses have more than enough time in the evening to update a reporting system.
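
One way to picture the two-system approach is a nightly job that rebuilds a denormalized rollup from the normalized tables. The sketch below uses Python's sqlite3 for illustration only. The `event_rollup` table, its columns, the `rebuild_rollup` helper, the integer codes, and the sample rows are all assumptions for the example, and a real setup would likely keep the reporting table in a separate database.

```python
import sqlite3

def rebuild_rollup(conn, day):
    """Recompute the reporting rows for one day from the normalized tables."""
    with conn:  # run the delete and the insert in one transaction
        conn.execute("DELETE FROM event_rollup WHERE day = ?", (day,))
        conn.execute("""
            INSERT INTO event_rollup
                (day, country, city, gender, age, event_count, value_sum)
            SELECT e.day, u.country, u.city, u.gender, u.age,
                   COUNT(*), SUM(e.value)
            FROM events AS e
            JOIN users AS u ON u.id = e.user_id
            WHERE e.day = ?
            GROUP BY e.day, u.country, u.city, u.gender, u.age
        """, (day,))

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, age INTEGER, gender INTEGER,
                    city INTEGER, country INTEGER, rank INTEGER);
CREATE TABLE events (id INTEGER PRIMARY KEY, day TEXT NOT NULL,
                     user_id INTEGER NOT NULL REFERENCES users(id),
                     value INTEGER NOT NULL);
CREATE TABLE event_rollup (day TEXT, country INTEGER, city INTEGER,
                           gender INTEGER, age INTEGER,
                           event_count INTEGER, value_sum INTEGER,
                           PRIMARY KEY (day, country, city, gender, age));
""")
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 20, 1, 495, 7, 3), (2, 30, 1, 495, 7, 5), (4, 24, 1, 495, 7, 2),
])
conn.executemany(
    "INSERT INTO events (day, user_id, value) VALUES (?, ?, ?)",
    [("2011-10-11", 1, 10), ("2011-10-11", 1, 5),
     ("2011-10-11", 4, 3), ("2011-10-11", 2, 7)],
)

rebuild_rollup(conn, "2011-10-11")  # the "nightly update"

# The web UI reads only the small rollup table, with no join.
count, total = conn.execute("""
    SELECT COALESCE(SUM(event_count), 0), COALESCE(SUM(value_sum), 0)
    FROM event_rollup
    WHERE day = ? AND country = ? AND city = ? AND gender = ?
      AND age BETWEEN ? AND ?
""", ("2011-10-11", 7, 495, 1, 18, 25)).fetchone()
print(count, total)  # prints 3 18
```

Because the job deletes and reinserts a whole day at a time, running it twice for the same day gives the same result. An average can be derived at query time as value_sum divided by event_count.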

© 2021 The Archive Base. All Rights Reserved