Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6158387
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 23, 20262026-05-23T20:56:55+00:00 2026-05-23T20:56:55+00:00

This is more a conceptual question. It’s inspired from using some extremely large table

  • 0

This is more a conceptual question. It’s inspired from using some extremely large table where even a simple query takes a long time (properly indexed). I was wondering is there is a better structure then just letting the table grow, continually.

By large I mean 10,000,000+ records that grows every day by something like 10,000/day. A table like that would hit 10,000,000 additional records every 2.7 years. Lets say that more recent records are accesses the most but the older ones need to remain available.
I have two conceptual ideas to speed it up.

1) Maintain a master table that holds all the data, indexed by date in reverse order. Create a separate view for each year that holds only the data for that year. Then when querying, and lets say the query is expected to pull only a few records from a three year span, I could use a union to combine the three views and select from those.

2) The other option would be to create a separate table for every year. Then, again using a union to combine them when querying.

Does anyone else have any other ideas or concepts? I know this is a problem Facebook has faced, so how do you think they handled it? I doubt they have a single table (status_updates) that contains 100,000,000,000 records.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-23T20:56:57+00:00Added an answer on May 23, 2026 at 8:56 pm

    The main RDBMS providers all have similar concepts in terms of partitioned tables and partitioned views (as well as combinations of the two)

    There is one immediate benefit, in that the data is now split across multiple conceptual tables, so any query that includes the partition key within the query can automatically ignore any partition that the key would not be in.

    From a RDBMS management perspective, having the data divided into seperate partitions allows operations to be performed at a partition level, backup / restore / indexing etc. This helps reduce downtimes as well as allow for far faster archiving by just removing an entire partition at a time.

    There are also non relational storage mechanisms such as nosql, map reduce etc, but ultimately how it is used, loaded and data is archived become a driving factor in the decision of the structure to use.

    10 million rows is not that large in the scale of large systems, partitioned systems can and will hold billions of rows.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have an interesting situation here this time. This more of a conceptual question,
This is more of a conceptual question than anything. I have a base class
This is more of an academic inquiry than a practical question. Are there any
This is more of an generic XML Schema question, but if and how do
This is more an observation than a real question: MS-Access (and VBA in general)
This is more of a syntax question I'm trying to write a store procedure
This is more of a business-oriented programming question that I can't seem to figure
Ok this is more of a computer science question, than a question based on
EDIT: This question is more about language engineering than C++ itself. I used C++
This is a conceptual question involving Hadoop/HDFS. Lets say you have a file containing

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.