I have a Postgres 9.1.4 table with a few million rows. A very small

Question

0

Editorial Team

Asked: June 8, 20262026-06-08T02:30:12+00:00 2026-06-08T02:30:12+00:00

I have a Postgres 9.1.4 table with a few million rows. A very small

0

I have a Postgres 9.1.4 table with a few million rows. A very small version could be this:

master

index  location
----------------
1      A
2      C
3      B
4      C
5      C
6      A

I need to do an expensive calculation on each distinct value in the location field. I don’t want to use this field from the master table because I would be repeatedly processing the same location. I want a table with distinct values of location, and the result of the calculation will be stored in a calculation field:

distinct

index  location  calculation
------------------------------
1      A'        X
2      C'        Y
3      B'        Z

The relationship between master and distinct is not easily determined after distinct is populated. I have to do a little data manipulation to make the locations work in the calculation. I really need a third table, created roughly at the same time I populate distinct, to help me correlate each entry in distinct back to its parent in master.

The third table might look like this:

correlation

master_index  distinct_index
------------------------------
1             1
2             3
3             2
4             3
5             3
6             1

The problem is I don’t see how this would be done with any straightforward SQL. I could use something like this as the beginning of a query to populate distinct:

SELECT location, array_agg(index)
FROM master
GROUP BY location;

The problem is I would need another array column in distinct to hold these values, then I’d later need to use some other program to parse the arrays and construct the correlation table.

Am I missing a simpler way of doing this?

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-08T02:30:14+00:00

You can create the “distinct” table as follows (although, I caution you to avoid using SQL key words as names of tables as columns):

create table TDistinct as
    select m.location, min(index) as TDindex, <whatever> as calculation
    from master m
    group by m.location

Create the correlation table as:

create table correlation as
    select m.index as MasterIndex, td.TDIndex
    from master m join
         TDistinct td
         on m.location = td.location

These work, but you might want something more efficient. After creating the tables, you add indexes for efficiency. You can also do other tricks, such as precreating the distinct table with an auto incremented primary key, and using that as the distinct index. Then you would use insert to load the data into the query.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have a Postgres 9.1.4 table with a few million rows. A very small

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply