I am currently trying to process a large block of simulation data (~2Gb worth).

Asked: May 24, 2026 (2026-05-24T09:11:45+00:00)

I am currently trying to process a large block of simulation data (~2Gb worth). The data is in a table which looks like:

Table: Simulation Data
+-------+--------+----------+-------+
|   id  | run_id | timestep | value |
+-------+--------+----------+-------+
|     1 |   1    |     1    | 0.00  |
|     2 |   1    |     2    | 0.003 |
|     : |   :    |     :    |   :   |
|  9543 |   1    |  9543    | 0.23  |
|  9544 |   2    |     1    | 0.00  |
|     : |   :    |     :    |   :   |
+-------+--------+----------+-------+

So for each run (identified by a run_id) there are a number of time steps with corresponding data (in the case of run_id 1, there were 9543 time steps).

During a simulation run, events take place. The time steps of these events are recorded in another table:

Table: Simulation Events
+-------+--------+----------+
|   id  | run_id | timestep |
+-------+--------+----------+
|    1  |   1    |  152     |
|    2  |   1    |  193     |
|    3  |   1    |  382     |
|    :  |   :    |   :      |
|  143  |   1    |  9382    |
|  144  |   2    |  137     |
|    :  |   :    |   :      |
+-------+--------+----------+

So for this set of data, run_id 1 had events at time steps 152, 193, 382, … 9382, run_id 2 has its first event at time step 137, and so on. I am interested in what happens in the 3 time steps before, the time step of, and the 3 time steps after each event for each run_id. I would like to put together a query that returns something that looks like:

+--------+----------------+----------+-------+
| run_id | event_timestep | delta_ts | value |
+--------+----------------+----------+-------+
|    1   |      152       |   -3     | 0.053 |
|    1   |      152       |   -2     | 0.042 |
|    1   |      152       |   -1     | 0.031 |
|    1   |      152       |    0     | 0.003 |
|    1   |      152       |    1     | 0.532 |
|    1   |      152       |    2     | 0.736 |
|    1   |      152       |    3     | 1.138 |
|    1   |      193       |   -3     | 0.049 |
|    :   |       :        |    :     |   :   |
|    1   |     9382       |   -3     | 0.068 |
|    :   |       :        |    :     |   :   |
|    1   |     9382       |    3     | 1.523 |
+--------+----------------+----------+-------+

Here the first row, with delta_ts = -3, would be the value from timestep 149, -2 would be from timestep 150, -1 from timestep 151, and so on. Any thoughts on putting together a query that would do this?

1 Answer

Editorial Team · Answer 1 · 2026-05-24T09:11:46+00:00

There are two ways to approach this:

You can use a plain join (a Cartesian product filtered by a condition), select ... from table t1, table t2 where ..., but you have to work out a condition that links two rows if and only if they are related. If the join ends up pairing rows of the same table, keep in mind that the pairs are symmetric, so add a condition like t1.id < t2.id, which also excludes pairing a row with itself.
Or you can use a cursor in a stored procedure, store the values for the previous n steps, and correlate them manually. This is slower and uses more memory, but it is easier to write.
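
For the join approach, the linking condition could be "same run_id, and the data time step lies within 3 steps of the event time step". Here is a minimal sketch of that idea, run against an in-memory SQLite database from Python so it can be tried as-is. The table names simulation_data and simulation_events, the index, and the tiny demo data set are assumptions made for illustration (the question only shows the table captions), and the SQL may need small adjustments for other database engines.

```python
import sqlite3

WINDOW = 3  # time steps to keep before and after each event

# Assumed table names; the question only gives the captions
# "Simulation Data" and "Simulation Events".
SCHEMA = """
CREATE TABLE simulation_data (
    id INTEGER PRIMARY KEY, run_id INTEGER, timestep INTEGER, value REAL);
CREATE TABLE simulation_events (
    id INTEGER PRIMARY KEY, run_id INTEGER, timestep INTEGER);
-- Assumption: a composite index keeps the range lookup cheap on large data.
CREATE INDEX idx_data_run_ts ON simulation_data (run_id, timestep);
"""

# Each event row is linked to the data rows of the same run whose
# time step falls inside [event - w, event + w].
JOIN_QUERY = """
SELECT e.run_id,
       e.timestep              AS event_timestep,
       d.timestep - e.timestep AS delta_ts,
       d.value
FROM simulation_events AS e
JOIN simulation_data   AS d
  ON d.run_id = e.run_id
 AND d.timestep BETWEEN e.timestep - :w AND e.timestep + :w
ORDER BY e.run_id, e.timestep, delta_ts
"""


def build_demo_db():
    """Create a small in-memory database with made-up demo values."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Run 1 has time steps 1..10, run 2 has time steps 1..4.
    data = [(1, t, 100.0 + t) for t in range(1, 11)]
    data += [(2, t, 200.0 + t) for t in range(1, 5)]
    conn.executemany(
        "INSERT INTO simulation_data (run_id, timestep, value) VALUES (?, ?, ?)",
        data,
    )
    # Events: run 1 at time steps 2 and 5, run 2 at time step 3.
    conn.executemany(
        "INSERT INTO simulation_events (run_id, timestep) VALUES (?, ?)",
        [(1, 2), (1, 5), (2, 3)],
    )
    return conn


def windows_by_join(conn, w=WINDOW):
    """Return (run_id, event_timestep, delta_ts, value) rows."""
    return conn.execute(JOIN_QUERY, {"w": w}).fetchall()


if __name__ == "__main__":
    for row in windows_by_join(build_demo_db()):
        print(row)
```

Note that an event close to the start or end of a run simply yields fewer than 7 rows, because the join only returns time steps that exist. If two events are fewer than 7 steps apart, the same data row appears once per event, which matches the output layout in the question.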
