I have two groups of files that contain data in CSV format with a common key (Timestamp) – I need to walk through all the records chronologically.
Group A: ‘Environmental Data’
- Filenames are in format A_0001.csv, A_0002.csv, etc.
- Pre-sorted ascending
- Key is Timestamp, i.e. YYYY-MM-DD HH:MM:SS
- Contains environmental data in CSV/column format
- Very large, several GB of data in total
Group B: ‘Event Data’
- Filenames are in format B_0001.csv, B_0002.csv
- Pre-sorted ascending
- Key is Timestamp, i.e. YYYY-MM-DD HH:MM:SS
- Contains event-based data in CSV/column format
- Relatively small compared to Group A files, < 100 MB
What is the best approach?
- Pre-merge: Use one of the various recipes out there to merge the files into a single sorted output and then read it for processing
- Real-time merge: Implement code to ‘merge’ the files in real time as they are read (a streaming sketch is shown after this question)
I will be running lots of iterations of the post-processing side of things. Any thoughts or suggestions? I am using Python.
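For reference, here is a minimal sketch of what the real-time merge option could look like using heapq.merge from the standard library. It assumes the Timestamp is the first column of every file, that the files have no header rows, and that the numbered filenames sort chronologically; group_stream is an illustrative name, not something from my existing code.

```python
import csv
import glob
import heapq

def group_stream(pattern):
    """Yield CSV rows from every file matching the pattern, in filename order.

    Each file is pre-sorted and the numbered filenames continue the sequence,
    so chaining the files gives one sorted stream per group.
    """
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            yield from csv.reader(f)

# heapq.merge consumes both streams lazily, so only a handful of rows are
# held in memory at a time, even though Group A is several GB in total.
merged = heapq.merge(
    group_stream("A_*.csv"),
    group_stream("B_*.csv"),
    key=lambda row: row[0],  # assumes Timestamp is the first column
)

for row in merged:
    ...  # chronological post-processing goes here
```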
I would suggest pre-merge.
Reading files carries real overhead, and interleaving two sets of files on every pass roughly doubles it. Since your program will be dealing with a large input (lots of files, especially in Group A) and you will be running many iterations of the post-processing, I think it is better to pay the merge cost once and have all your relevant data in a single sorted file. It would also reduce the number of variables and read statements you will need. This will improve the runtime of each iteration, and I think that is a good enough reason in this scenario to use this approach.
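For the one-off merge step itself, something along these lines should be enough. It is only a sketch: it assumes the Timestamp is the first column, that the files have no header rows, and that the numbered filenames sort chronologically (merged.csv is just an illustrative output name).

```python
import csv
import glob
import heapq

def rows(pattern):
    # Chain each group's pre-sorted files into one sorted stream.
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            yield from csv.reader(f)

# Merge the two sorted streams once and write a single sorted file;
# every later post-processing iteration then reads only merged.csv.
with open("merged.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerows(
        heapq.merge(rows("A_*.csv"), rows("B_*.csv"),
                    key=lambda r: r[0])  # r[0] assumed to be the Timestamp
    )
```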
Hope this helps