I have this joining table: CREATE TABLE [TagMap]( [intItemId] [bigint] NOT NULL, [intTagId] [bigint]

Question

0

Asked: June 1, 20262026-06-01T21:15:31+00:00 2026-06-01T21:15:31+00:00

I have this joining table: CREATE TABLE [TagMap]( [intItemId] [bigint] NOT NULL, [intTagId] [bigint]

0

I have this joining table:

 CREATE TABLE [TagMap](
[intItemId] [bigint] NOT NULL,
[intTagId] [bigint] NOT NULL,
 CONSTRAINT [PK_TagMap_intItemId] PRIMARY KEY CLUSTERED 
 (
[intItemId] ASC,
[intTagId] ASC
 ))

My sproc discovers the tags that are associated with a seedItemId and then selects other intItemIds that are associated with those tags, like this:

 declare @baseTags table    (intTagId bigint primary key clustered);

INSERT INTO @baseTags (intTagId)
SELECT TOP 20 t1.intTagId
FROM   TagMap t1
WHERE t1.intItemId = 776

 SELECT TOP 500 t1.intItemId
 FROM   TagMap t1
      JOIN @baseTags t2
       ON t1.intTagId = t2.intTagId
 GROUP  BY t1.intItemId
 ORDER  BY Count(*) DESC

What I need to to is specify a minimum number of tag associations – say 10 – between two intItemIds below which a value is not returned. In other words in the TagMap table can find ten or more intTagIds that two intItemIds have in common, it’s good and we select it – otherwise ignore it.

So, for example given this data:

 CREATE TABLE #TagMap(
 [intItemId] [bigint] NOT NULL,
 [intTagId] [bigint] NOT NULL,
 CONSTRAINT [PK_TagMap_intItemId] PRIMARY KEY CLUSTERED 
 (
 [intItemId] ASC,
 [intTagId] ASC
 ))

 insert into #TagMap
 (intItemId, intTagId)
 values 
 (1, 100),(1, 200),(1, 300),
 (2, 100),(2, 200),         (2, 500),(2, 600),
 (3, 100),                  (3, 500),(3, 600)

Assume the match threshold is two.

If the seed is intItemId 1, then only intItemId 2 should be returned (it has two matching tag Ids: 100 and 200, while intItemId 3 has one: 100, which is below the threshold).

If the seed is intItemId 2, then both intItemId 1 and 3 should be returned (intItemId 1 matches the tag Ids 100 and 200, while intItemId 3 matches the tag Ids 500 and 600).

If the seed is intItemId 3, then only intItemId 2 should be returned (it has two matching tag Ids: 500 and 600, while intItemId has one: 100 which is below the threshold).

Any ideas how to do this, please?

Cheers,
Matt

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-01T21:15:33+00:00

SELECT
  [foreign].intItemID
FROM
  #TagMap    AS [primary]
INNER JOIN
  #TagMap    AS [foreign]
    ON [foreign].intTagID = [primary].intTagID
WHERE
  [primary].intItemID = 1
GROUP BY
  [foreign].intItemID
HAVING
  COUNT(distinct [foreign].intTagID) >= @threshold

Note, however, that this scales pretty badly; because the JOIN searches for “has any of these tags”, and then only in the HAVING clause can you specify “has all of these tags”.

From my experience you can make minor optimisations, but the most rewarding approach I found was to cache the results into a more standard mapping table. Updating them as and when necessary. (The contents of the tags data changing very infrequently.)

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have this joining table: CREATE TABLE [TagMap]( [intItemId] [bigint] NOT NULL, [intTagId] [bigint]

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply