I have to generate about a million random trips between about 40K destinations. Each

Asked: May 22, 2026 (2026-05-22T23:27:05+00:00)

I have to generate about a million random trips between about 40K destinations. Each destination has its own weight (total_probability); the higher it is, the more trips should start or end at this place.

Either the trips are generated randomly, with destinations (start and end points) weighted by probability, or an exact number of trips is pre-calculated (divide each weight by the sum of weights, multiply by 1M and round to integers).
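To make the second option concrete, here is a minimal Python sketch of the arithmetic (divide by the sum of weights, multiply by the trip total, round). The function name `expected_trip_counts` and the toy weights are hypothetical, chosen only for illustration. Note that with plain rounding the counts are not guaranteed to add up to exactly 1M for arbitrary weights.

```python
def expected_trip_counts(weights, total_trips):
    """Scale each weight to its share of total_trips and round to an integer.

    weights: dict mapping destination id -> total_probability.
    """
    weight_sum = sum(weights.values())
    return {
        dest_id: round(w / weight_sum * total_trips)
        for dest_id, w in weights.items()
    }


# Hypothetical toy data: three destinations with weights 5, 3 and 2.
weights = {1: 5.0, 2: 3.0, 3: 2.0}
counts = expected_trip_counts(weights, 1_000_000)
print(counts)
```

These are per-destination totals only; they still have to be paired into (from_id, to_id) trips in a separate step.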

The problem is how to do this in PostgreSQL without generating the 40K × 40K table of all destination pairs.

          Table "public.dests"
   Column          |       Type       | Modifiers 
-------------------+------------------+-----------
 id                | integer          | 
 total_probability | double precision | 

          Table "public.trips"
   Column   |       Type       | Modifiers 
------------+------------------+-----------
 from_id    | integer          | 
 to_id      | integer          | 
 trips_num  | integer          | 
 ...
 some other metrics...

The primary key for trips is (from_id, to_id).
Should I generate a table with 1M records and then update it iteratively, or will a for loop with 1M inserts be fast enough? I work on a 2-core lightweight laptop.

P.S. I gave up and did this in Python. To perform a set of queries and the transformation in Python, I’ll run SQL scripts from Python rather than from a shell script. Thanks for suggestions!
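For readers wondering what the Python route might look like, below is a rough sketch, not the script actually used. It draws start and end points independently with `random.choices`, weighted by total_probability, and aggregates repeated pairs with a `Counter`, so only pairs that actually occur are stored and the 40K × 40K table is never built. The function name `sample_trips` and the toy weights are assumptions for illustration; the sketch also allows trips where from_id equals to_id, which may need filtering.

```python
import random
from collections import Counter


def sample_trips(weights, n_trips, seed=None):
    """Draw n_trips weighted (from_id, to_id) pairs and count duplicates.

    weights: dict mapping destination id -> total_probability.
    Returns a Counter keyed by (from_id, to_id) with the number of trips.
    """
    rng = random.Random(seed)
    ids = list(weights)
    w = [weights[i] for i in ids]
    # Start and end points are sampled independently with the same weights.
    starts = rng.choices(ids, weights=w, k=n_trips)
    ends = rng.choices(ids, weights=w, k=n_trips)
    return Counter(zip(starts, ends))


# Hypothetical toy data: three destinations, 1000 trips.
trips = sample_trips({1: 5.0, 2: 3.0, 3: 2.0}, 1000, seed=42)
for (from_id, to_id), trips_num in sorted(trips.items()):
    print(from_id, to_id, trips_num)
```

Each entry corresponds to one row of the trips table (from_id, to_id, trips_num), so the result can be bulk-loaded in a single pass rather than with 1M individual inserts.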


1 Answer

Editorial Team · Answer 1 · 2026-05-22T23:27:06+00:00

In PostgreSQL 9.1, you can use TRIGGERs on VIEWs, which effectively let you create materialized views (albeit manually). The first run may be expensive, but a loop is probably the way to go; after that, I’d use a series of TRIGGERs to maintain the data in a table.

At the end of the day, you need to decide whether you want to calculate the results for every query or memoize the result via a materialized view.
