The Archive Base Latest Questions

Editorial Team
Asked: May 11, 2026

Let’s say I’m getting a large (2 million rows?) amount of data that’s supposed

Let’s say I’m getting a large (2 million rows?) amount of data that’s supposed to be static and unchanging. Supposed to be. And this data gets republished monthly. What methods are available to 1) be aware of what data points have changed from month to month and 2) consume the data given a point in time?

Solution 1) Naively save every snapshot of the data, annotated by date. Diff awareness has to be handled by some in-house program, but consuming the data by date is trivial. Con: space requirements balloon by an order of magnitude.

Solution 2A) Using an in-house program, track when the diffs happen and store them in an EAV table, annotated by date. Space requirements are low, but consuming the diffs together with the original data becomes unwieldy.
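To make the "unwieldy" part concrete, here is a minimal sketch of what reading the data as of a date might look like under 2A. Everything in it is assumed for illustration: the `base` snapshot, the `changes` log layout, the sample values, and the `as_of` helper. It only handles changed attributes, not inserted or deleted rows.

```python
import copy

# Hypothetical first snapshot: entity key -> {attribute: value}
base = {
    "r1": {"price": "10", "color": "red"},
    "r2": {"price": "20", "color": "blue"},
}

# Hypothetical EAV change log: (change_date, entity_key, attribute, new_value).
# ISO dates are used so that string comparison matches date order.
changes = [
    ("2024-02-01", "r1", "price", "12"),
    ("2024-03-01", "r2", "color", "green"),
]


def as_of(date):
    """Rebuild the data set as of `date` by replaying the change log over the base."""
    snapshot = copy.deepcopy(base)
    for change_date, key, attr, value in sorted(changes):
        if change_date <= date:
            snapshot[key][attr] = value
    return snapshot


february = as_of("2024-02-15")
print(february["r1"]["price"], february["r2"]["color"])
```

Every read has to replay the log over the base data, which is where the integration cost shows up.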

Solution 2B) Using an in-house program, track when the diffs happen and store them in a sparsely filled table that looks much like the original table, filled only with the data that has changed and the date of the change. Con: the model is sparse, and consuming it together with the original data is non-trivial.

I guess, basically, how do I integrate the dimension of time into a relational database, keeping in mind both the viewing of the data and awareness of differences between time periods?

Does this relate to data warehousing at all?

Smells like… Slowly changing dimension?

1 Answer

  1. Added an answer on May 11, 2026 at 3:13 am

    I had a similar problem – big flat files imported to the database once per day. Most of the data is unchanging.

    Add two extra columns to the table, starting_date and ending_date. The default value for ending_date should be sometime in the future.
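A minimal sketch of such a table, using Python's built-in sqlite3 module. The table name `facts`, the `key`/`value` columns, and the `9999-12-31` sentinel are assumptions for illustration; dates are stored as ISO strings so that they compare in date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: one business key, one data column, plus the two
# validity columns. '9999-12-31' stands in for "sometime in the future".
conn.execute(
    """
    CREATE TABLE facts (
        key           TEXT NOT NULL,
        value         TEXT,
        starting_date TEXT NOT NULL,
        ending_date   TEXT NOT NULL DEFAULT '9999-12-31',
        PRIMARY KEY (key, starting_date)
    )
    """
)

# A newly loaded row only needs a starting_date; ending_date takes the default.
conn.execute(
    "INSERT INTO facts (key, value, starting_date) VALUES (?, ?, ?)",
    ("A", "1", "2024-01-01"),
)

row = conn.execute(
    "SELECT key, value, starting_date, ending_date FROM facts"
).fetchone()
print(row)
```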

    To compare one file to the next, sort them both by the key columns, then read one row from each file.

    • If the keys are equal: compare the rest of the columns to see if the data has changed. If the row data is equal, the row is already in the database and there’s nothing to do; if it’s different, update the existing row in the database with an ending_date of today and insert a new row with a starting_date of today. Read a new row from both files.
    • If the key from the old file is smaller: the row was deleted. Update ending_date to today. Read a new row from the old file.
    • If the key from the new file is smaller: a row was inserted. Insert the row into the database with a starting_date of today. Read a new row from the new file.

    Repeat until you’ve read everything from both files.
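The comparison loop above can be sketched roughly as follows. This is an illustration under assumptions, not the exact program I used: rows are simplified to in-memory `(key, data)` tuples instead of file reads, the function name `diff_sorted` is made up, and both inputs must already be sorted by key using the same ordering.

```python
from typing import Iterable, Iterator, Tuple

Row = Tuple[str, str]  # (key, rest of the row); a simplification for illustration


def diff_sorted(old: Iterable[Row], new: Iterable[Row]) -> Iterator[Tuple[str, str, str]]:
    """Yield (action, key, data) tuples, where action is
    "changed", "deleted" or "inserted". Unchanged rows yield nothing."""
    old_it, new_it = iter(old), iter(new)
    o = next(old_it, None)
    n = next(new_it, None)
    while o is not None or n is not None:
        if n is None or (o is not None and o[0] < n[0]):
            # Key exists only in the old file: the row was deleted.
            yield ("deleted", o[0], o[1])
            o = next(old_it, None)
        elif o is None or n[0] < o[0]:
            # Key exists only in the new file: the row was inserted.
            yield ("inserted", n[0], n[1])
            n = next(new_it, None)
        else:
            # Same key in both files: compare the remaining columns.
            if o[1] != n[1]:
                yield ("changed", n[0], n[1])
            o = next(old_it, None)
            n = next(new_it, None)


old_rows = [("a", "1"), ("b", "2"), ("c", "3")]
new_rows = [("a", "1"), ("b", "9"), ("d", "4")]
actions = list(diff_sorted(old_rows, new_rows))
print(actions)
```

Each action then maps onto the database work described in the bullets: "changed" closes the old row and inserts a new one, "deleted" only sets ending_date, and "inserted" only adds a row.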

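    Now to query for the rows that were valid on any given date, just select with a where clause such as test_date >= starting_date and test_date < ending_date. A plain between on starting_date and ending_date is inclusive at both ends, so a row that changed on test_date would match both its old and new versions.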
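A small sketch of both kinds of lookup, again with sqlite3. The table contents and the helper names `as_of` and `changed_between` are assumptions for illustration. The date test here is written as a half-open range (start inclusive, end exclusive) so that a row superseded on the query date does not show up twice; that choice is mine and is easy to swap out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (key TEXT, value TEXT, starting_date TEXT, ending_date TEXT)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("a", "1", "2024-01-01", "9999-12-31"),  # never changed
        ("b", "2", "2024-01-01", "2024-02-01"),  # superseded on 2024-02-01
        ("b", "9", "2024-02-01", "9999-12-31"),  # current version of b
        ("c", "3", "2024-01-01", "2024-02-01"),  # deleted on 2024-02-01
    ],
)


def as_of(test_date):
    """Rows that were valid on test_date."""
    return conn.execute(
        "SELECT key, value FROM facts "
        "WHERE starting_date <= ? AND ? < ending_date ORDER BY key",
        (test_date, test_date),
    ).fetchall()


def changed_between(d1, d2):
    """Keys whose validity started or ended after d1 and on or before d2."""
    rows = conn.execute(
        "SELECT DISTINCT key FROM facts "
        "WHERE (starting_date > ? AND starting_date <= ?) "
        "   OR (ending_date > ? AND ending_date <= ?) ORDER BY key",
        (d1, d2, d1, d2),
    ).fetchall()
    return [r[0] for r in rows]


print(as_of("2024-01-15"))
print(as_of("2024-02-01"))
print(changed_between("2024-01-15", "2024-02-15"))
```

The second helper covers the "what changed from month to month" part of the question without a separate diff table, since every change leaves a starting_date or ending_date inside the period.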
