Here is my setup: Table records contains multiple (more than two) PKID columns along

Question

Editorial Team

Asked: June 13, 2026 (2026-06-13T07:41:51+00:00)

Here is my setup:

Table records contains multiple (more than two) PKID columns along with some other columns.

Table cached_records only has two columns, which are the same as two of the PKIDs for records.

For instance, let’s assume records has PKIDs ‘keyA’, ‘keyB’, and ‘keyC’ and cached_records only has ‘keyA’ and ‘keyB’.

I need to pull the rows from the records table where the appropriate PKIDs (so, ‘keyA’ and ‘keyB’) are not in the cached_records table.

If I were working with only ONE PKID, I know how simple this task would be:

SELECT
    pkid
FROM
    records
WHERE
    pkid NOT IN (SELECT pkid FROM cached_records)

However, the fact that there are two PKIDs means I can’t use a simple NOT IN. This is what I currently have:

SELECT
    `keys`.`keyA` AS `keyA`,
    `keys`.`keyB` AS `keyB`
FROM
    (
        SELECT DISTINCT
            `keyA`,
            `keyB`
        FROM
            `records`
    ) AS `keys`
        LEFT JOIN
                `cached_records` AS `cached`
            ON
                    `keys`.`keyA` = `cached`.`keyA`
                AND
                    `keys`.`keyB` = `cached`.`keyB`
WHERE
    (
            `cached`.`keyA` IS NULL
        AND
            `cached`.`keyB` IS NULL
    )

(The DISTINCT is needed because I am only selecting two of the PKID columns from the records table, so there could be duplicates, and I really don’t need them; ‘keyC’ is not used here, although it is part of what makes a row in records unique.)

The query above works just fine. However, as the cached_records table grows, the query takes longer and longer to run (we’re talking minutes now, and sometimes long enough that my code hangs and crashes).

So, I’m wondering what the most efficient way is to do this kind of operation (selecting rows from one table where the rows don’t exist in another) with multiple PKIDs instead of just one.

1 Answer

Editorial Team · Answer 1 · 2026-06-13T07:41:52+00:00

This should be quicker:

SELECT  DISTINCT
    `records`.`keyA` AS `keyA`,
    `records`.`keyB` AS `keyB`
FROM
    `records`
        LEFT JOIN
                `cached_records` AS `cached`
            ON
                    `records`.`keyA` = `cached`.`keyA`
                AND
                    `records`.`keyB` = `cached`.`keyB`
WHERE
            `cached`.`keyA` IS NULL -- one is enough here
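
The same "rows in records with no match in cached_records" condition can also be written with NOT EXISTS, which works for any number of key columns. This is only a sketch of an alternative formulation using the table and column names from the question; whether it runs faster than the LEFT JOIN form depends on the MySQL version and the available indexes, so it is worth comparing both.

```sql
-- Anti-join written with NOT EXISTS: keep a (keyA, keyB) pair from records
-- only when no row in cached_records has the same pair.
SELECT DISTINCT
    `records`.`keyA` AS `keyA`,
    `records`.`keyB` AS `keyB`
FROM
    `records`
WHERE
    NOT EXISTS (
        SELECT 1
        FROM `cached_records` AS `cached`
        WHERE `cached`.`keyA` = `records`.`keyA`
          AND `cached`.`keyB` = `records`.`keyB`
    );
```

MySQL also accepts a row constructor, so `(keyA, keyB) NOT IN (SELECT keyA, keyB FROM cached_records)` is valid as well. NOT IN behaves differently from NOT EXISTS when NULLs are involved, which should not matter here if the key columns are NOT NULL.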

Notes:

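Wrapping the DISTINCT in a derived table (the subquery in FROM) costs a lot of performance. You can do the DISTINCT in the outermost SELECT of the LEFT JOIN query instead.
Checking one of the two keys for NULL is enough: neither key can be NULL in an actual cached_records row, so a NULL there can only come from a records row that found no match in the LEFT JOIN.
You should verify that the keyA and keyB columns have the same type in both tables, so that no implicit conversion occurs in the join (I have seen this happen in live production code…).
You should have proper indexes on both tables. Minutes for this query is a sign that something is badly wrong (or that there is an insane amount of data).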
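
The last two notes can be checked directly in MySQL. The statements below are a rough sketch, not a prescription: they assume cached_records does not already have a primary key or index beginning with (keyA, keyB), and the index name `idx_cached_keyA_keyB` is made up for this example.

```sql
-- Assumption: no index on cached_records starts with (keyA, keyB) yet.
-- If (keyA, keyB) is already the primary key, skip this statement.
CREATE INDEX `idx_cached_keyA_keyB` ON `cached_records` (`keyA`, `keyB`);

-- Compare the key column definitions in both tables; a mismatch in
-- type or collation can force a conversion and defeat the index.
SELECT TABLE_NAME, COLUMN_NAME, COLUMN_TYPE, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('records', 'cached_records')
  AND COLUMN_NAME IN ('keyA', 'keyB');

-- Ask the optimizer how it plans to execute the anti-join.
EXPLAIN
SELECT DISTINCT
    `records`.`keyA` AS `keyA`,
    `records`.`keyB` AS `keyB`
FROM
    `records`
        LEFT JOIN
                `cached_records` AS `cached`
            ON
                    `records`.`keyA` = `cached`.`keyA`
                AND
                    `records`.`keyB` = `cached`.`keyB`
WHERE
    `cached`.`keyA` IS NULL;
```

In the EXPLAIN output, the row for `cached` should name an index in the `key` column. A `type` of `ALL` with a NULL `key` usually means cached_records is being read without an index, which would fit the slowdown described in the question.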
