I am trying to work out a bug we’ve found during our last iteration

Question

0

Asked: May 16, 20262026-05-16T07:12:54+00:00 2026-05-16T07:12:54+00:00

I am trying to work out a bug we’ve found during our last iteration

0

I am trying to work out a bug we’ve found during our last iteration of testing. It involves a query which uses a common table expression. The main theme of the query is that it simulates a ‘first’ aggregate operation (get the first row for this grouping).

The problem is that the query seems to choose rows completely arbitrarily in some circumstances – multiple rows from the same group get returned, some groups simply get eliminated altogether. However, it always picks the correct number of rows.

I have created a minimal example to post here. There are clients and addresses, and a table which defines the relationships between them. This is a much simplified version of the actual query I’m looking at, but I believe it should have the same characteristics, and it is a good example to use to explain what I think is going wrong.

CREATE TABLE [Client] (ClientID int, Name varchar(20))
CREATE TABLE [Address] (AddressID int, Street varchar(20))
CREATE TABLE [ClientAddress] (ClientID int, AddressID int)

INSERT [Client] VALUES (1, 'Adam')
INSERT [Client] VALUES (2, 'Brian')
INSERT [Client] VALUES (3, 'Charles')
INSERT [Client] VALUES (4, 'Dean')
INSERT [Client] VALUES (5, 'Edward')
INSERT [Client] VALUES (6, 'Frank')
INSERT [Client] VALUES (7, 'Gene')
INSERT [Client] VALUES (8, 'Harry')

INSERT [Address] VALUES (1, 'Acorn Street')
INSERT [Address] VALUES (2, 'Birch Road')
INSERT [Address] VALUES (3, 'Cork Avenue')
INSERT [Address] VALUES (4, 'Derby Grove')
INSERT [Address] VALUES (5, 'Evergreen Drive')
INSERT [Address] VALUES (6, 'Fern Close')

INSERT [ClientAddress] VALUES (1, 1)
INSERT [ClientAddress] VALUES (1, 3)
INSERT [ClientAddress] VALUES (2, 2)
INSERT [ClientAddress] VALUES (2, 4)
INSERT [ClientAddress] VALUES (2, 6)
INSERT [ClientAddress] VALUES (3, 3)
INSERT [ClientAddress] VALUES (3, 5)
INSERT [ClientAddress] VALUES (3, 1)
INSERT [ClientAddress] VALUES (4, 4)
INSERT [ClientAddress] VALUES (4, 6)
INSERT [ClientAddress] VALUES (5, 1)
INSERT [ClientAddress] VALUES (6, 3)
INSERT [ClientAddress] VALUES (7, 2)
INSERT [ClientAddress] VALUES (8, 4)
INSERT [ClientAddress] VALUES (5, 6)
INSERT [ClientAddress] VALUES (6, 3)
INSERT [ClientAddress] VALUES (7, 5)
INSERT [ClientAddress] VALUES (8, 1)
INSERT [ClientAddress] VALUES (5, 4)
INSERT [ClientAddress] VALUES (6, 6)

;WITH [Stuff] ([ClientID], [Name], [Street], [RowNo]) AS
(
    SELECT
        [C].[ClientID],
        [C].[Name],
        [A].[Street],
        ROW_NUMBER() OVER (ORDER BY [A].[AddressID]) AS [RowNo]
    FROM
        [Client] [C] INNER JOIN
        [ClientAddress] [CA] ON
            [C].[ClientID] = [CA].[ClientID] INNER JOIN
        [Address] [A] ON
            [CA].[AddressID] = [A].[AddressID]
)
SELECT
    [CTE].[ClientID],
    [CTE].[Name],
    [CTE].[Street],
    [CTE].[RowNo]
FROM
    [Stuff] [CTE]
WHERE
    [CTE].[RowNo] IN (SELECT MIN([CTE2].[RowNo]) FROM [Stuff] [CTE2] GROUP BY [CTE2].[ClientID])
ORDER BY
    [CTE].[Name] ASC,
    [CTE].[Street] ASC

DROP TABLE [ClientAddress]
DROP TABLE [Address]
DROP TABLE [Client]

The query is designed to get all clients, and their first address (the address with the lowest ID). This appears to me that it should work.

I have a theory about why it sometimes will not work. The statement that follows the CTE refers to the CTE in two places. If the CTE is non-deterministic, and it gets run more than once, the result of the CTE may be different in the two places it’s referenced.

In my example, the CTE’s RowNo column uses ROW_NUMBER() with an order by clause that will potentially result in different orderings when run multiple times (we’re ordering by address, the clients can be in any order depending on how the query is executed).

Because of this is it possible that CTE and CTE2 can contain different results? Or is the CTE only executed once and do I need to look elsewhere for the problem?

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-16T07:12:55+00:00

It is not guaranteed in any way.

SQL Server is free to evaluate CTE each time it’s accessed or cache the results, depending on the plan.

You may want to read this article:

Generating XML in subqueries

If your CTE is not deterministic, you will have to store its result in a temporary table or a table variable and use it instead of the CTE.

PostgreSQL, on the other hand, always evaluates CTEs only once, caching their results.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I am trying to work out a bug we’ve found during our last iteration

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply