The Archive Base

Editorial Team
Asked: May 19, 2026 (2026-05-19T01:08:36+00:00)

I am working on an application that works like this:

  1. It fetches data from many sources, resulting in a pool of about 500,000-1,500,000 records (depending on the time and day).
  2. The data is parsed.
  3. Part of the data is processed: it is compared with pre-existing data (read from the database), calculations are made, and the results are stored in the database. The resulting data set that has to be stored is much smaller than the original, ranging from 5,000 to 50,000 records. This step almost always updates existing data and perhaps adds a few more records.
  4. The data from step 2 should then be kept somewhere, so that the next time data is fetched there is a data set that can be used for the calculations without touching the pre-existing data in the database. This data can be lost; it is not irreplaceable (the key information can be read from the database if needed), but keeping it would speed up the next run.

The application’s components can (and will) run on different computers on the same network, so the storage has to be reachable from multiple hosts.

I have considered using memcached, but I’m not sure whether I should: one record is usually at least 200 bytes, so 1,500,000 records would amount to at least 300 MB of memcached cache. That doesn’t seem scalable to me. What if the data were 5x that amount? It could easily end up consuming 1-2 GB of cache just to keep data between iterations.
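The size estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: the 200-byte figure is the minimum record size quoted in the question, `cache_size_mb` is a hypothetical helper name, and the result covers payload only (keys and memcached's per-item overhead would come on top).

```python
RECORD_BYTES = 200  # minimum record size quoted in the question


def cache_size_mb(record_count: int, record_bytes: int = RECORD_BYTES) -> float:
    """Lower bound on payload size in MB (1 MB = 1,000,000 bytes)."""
    return record_count * record_bytes / 1_000_000


if __name__ == "__main__":
    # Low end, high end, and 5x the high end of the record counts in the question.
    for count in (500_000, 1_500_000, 5 * 1_500_000):
        print(f"{count:>9,} records -> at least {cache_size_mb(count):,.0f} MB")
```

At 5x the largest pool this lower bound is already 1,500 MB, which matches the 1-2 GB concern.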

So, the question is: which temporary storage mechanism would be most suitable for this kind of processing? I haven’t considered MySQL temporary tables, as I’m not sure whether they can persist between sessions and be used by other hosts on the network. Any other suggestions? Is there something else I should consider?


1 Answer

  1. Editorial Team
    Added an answer on May 19, 2026 at 1:08 am

    I know this sounds very old-school, but a temp file on your SAN would be easy and cheap.

    Loading a 300 MB file at the start of each run is trivial compared to consuming 300 MB of cache all the time.

    And if you can recreate the data from the database keys, it would be wise to write and test that path too, and make it automatic: if the temp file is unavailable, the information is mined from the keys and the file is recreated.
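A minimal sketch of the temp-file approach with an automatic fallback might look like the following. Everything here is an assumption for illustration: the JSON format, the function names `save_snapshot` and `load_snapshot`, and the `rebuild` callback, which stands in for whatever code re-derives the data from the database keys. The path is expected to point at storage that every host can reach.

```python
import json
import os
import tempfile
from typing import Callable


def save_snapshot(path: str, records: dict) -> None:
    """Write the snapshot to a temp file, then rename it into place.

    os.replace is atomic on a local POSIX filesystem; on a network share
    the guarantee depends on the filesystem, so treat this as best effort.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(records, fh)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_snapshot(path: str, rebuild: Callable[[], dict]) -> dict:
    """Load the snapshot; if it is missing or unreadable, rebuild and re-save it."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    except (OSError, ValueError):  # missing file, I/O error, or corrupt JSON
        records = rebuild()  # hypothetical: re-derive from the database keys
        try:
            save_snapshot(path, records)
        except OSError:
            pass  # storage still unavailable; continue with in-memory data
        return records
```

Each run would call `load_snapshot` once at the start and `save_snapshot` once at the end, so the data does not occupy memory between runs. The rebuild path is exercised whenever the file is absent, which makes it easy to test by simply deleting the file.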



© 2021 The Archive Base. All Rights Reserved