Trying not to select same connection multiple times I have tried to use charindex()

Question

0

Asked: May 18, 20262026-05-18T01:16:42+00:00 2026-05-18T01:16:42+00:00

Trying not to select same connection multiple times I have tried to use charindex()

0

Trying not to select same connection multiple times I have tried to use charindex() = 0 condition the following way:

WITH Cluster(calling_party, called_party, link_strength, Path)
AS
(SELECT
    calling_party,
    called_party,
    link_strength,
    CONVERT(varchar(max), calling_party + '.' + called_party) AS Path
FROM
    monthly_connections_test
WHERE
    link_strength > 0.1 AND
    calling_party = 'b'
UNION ALL
SELECT
    mc.calling_party,
    mc.called_party,
    mc.link_strength,
    CONVERT(varchar(max), cl.Path + '.' + mc.calling_party + '.' + mc.called_party) AS Path
FROM
    monthly_connections_test mc
INNER JOIN Cluster cl ON
    (
        mc.called_party = cl.called_party OR
        mc.called_party = cl.calling_party
    ) AND
    (
        CHARINDEX(cl.called_party + '.' + mc.calling_party, Path) = 0 AND
        CHARINDEX(cl.called_party + '.' + mc.called_party, Path) = 0
    )
WHERE
    mc.link_strength > 0.1
)
SELECT
    calling_party,
    called_party,
    link_strength,
    Path
FROM
    Cluster OPTION (maxrecursion 30000)

The condition however does not fulfill its purpose as the same rows are selected multiple times.

The actual aim here is to retrieve the whole cluster of connections to which the selected user (in the example user b) belongs.

Edit1:

I tried to modify the query the following way:

With combined_users AS
 (SELECT calling_party CALLING, called_party CALLED, link_strength FROM dbo.monthly_connections_test WHERE link_strength > 0.1),
 related_users1 AS
 (
 SELECT c.CALLING, c.CALLED, c.link_strength, CONVERT(varchar(max), '.' + c.CALLING + '.' + c.CALLED + '.') path from combined_users c where CALLING = 'a1'
 UNION ALL
 SELECT c.CALLING, c.CALLED, c.link_strength,
    convert(varchar(max),r.path  + c.CALLED + '.') path 
        from combined_users c 
        join related_users1 r 
        ON (c.CALLING = r.CALLED) and CHARINDEX(c.CALLING + '.' + c.CALLED + '.', r.path)= 0 

        ),
related_users2 AS
(
SELECT c.CALLING, c.CALLED, c.link_strength, CONVERT(varchar(max), '.' + c.CALLING + '.' + c.CALLED + '.') path from combined_users c where CALLED = 'a1'
UNION ALL
 SELECT c.CALLING, c.CALLED, c.link_strength,
    convert(varchar(max),r.path  + c.CALLING + '.') path 
        from combined_users c 
        join related_users2 r 
        ON c.CALLED = r.CALLING and CHARINDEX('.' + c.CALLING + '.' + c.CALLED, r.path)= 0
)
        SELECT CALLING, CALLED, link_strength, path FROM
        (SELECT CALLING, CALLED, link_strength, path FROM related_users1 UNION SELECT CALLING, CALLED, link_strength, path FROM related_users2) r OPTION (MAXRECURSION 30000)

To test the query I created the cluster below:

alt text

The query above returned the following table:

a1  a2  1.0000000   .a1.a2.
a11 a13 1.0000000   .a12.a1.a13.a11.
a12 a1  1.0000000   .a12.a1.
a13 a12 1.0000000   .a12.a1.a13.
a14 a13 1.0000000   .a12.a1.a13.a14.
a15 a14 1.0000000   .a12.a1.a13.a14.a15.
a2  a10 1.0000000   .a1.a2.a10.
a2  a3  1.0000000   .a1.a2.a3.
a3  a4  1.0000000   .a1.a2.a3.a4.
a3  a6  1.0000000   .a1.a2.a3.a6.
a4  a8  1.0000000   .a1.a2.a3.a4.a8.
a4  a9  1.0000000   .a1.a2.a3.a4.a9.

The query obviously returns connections toward the selected node and the connections in the opposite direction. The problem is the change of direction: for example the connections a7, a4 and a11, a10 are not selected because the direction is changed (relative to starting node).

Does anyone know how to modify the query in order to include all connections?

Thank you

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-18T01:16:42+00:00

Okay, there are several things to discuss here.

Zerothly, i have PostgreSQL, so all this is done with that; i’m trying to only use standard SQL, so this should work okay with SQL Server too.

Firstly, if you’re only interested in calls with link strength greater than 0.1, let’s say:

-- like calls, but only strong enough to be interesting
create view strong_calls (calling_party, called_party, link_strength)
as (
  select calling_party, called_party, link_strength
  from monthly_connections_test
  where link_strength > 0.1
);

And from now on, we’ll talk in terms of this table.

Secondly, you say:

The actual aim here is to retrieve the whole cluster of connections to which the selected user (in the example user b) belongs.

If that’s true, why are you computing the path? If you just want to know the set of connections, you can do:

with recursive cluster (calling_party, called_party, link_strength)
as (
  (
    select calling_party, called_party, link_strength
    from strong_calls
    where calling_party = 'b'
  )
  union
  (
    select c.calling_party, c.called_party, c.link_strength
    from cluster this, strong_calls c
    where c.calling_party = this.called_party
    or c.called_party = this.calling_party
  )
)
select *
from cluster;

Thirdly, perhaps you don’t really want to find what connections are in the cluster, you want to find which people are in the cluster, and what the shortest path to them from the target is. In that case, you can do:

with recursive cluster (party, path)
as (
  select cast('b' as character varying), cast('b' as character varying)
  union
  (
    select (case
      when this.party = c.calling_party then c.called_party
      when this.party = c.called_party then c.calling_party
    end), (this.path || '.' || (case
      when this.party = c.calling_party then c.called_party
      when this.party = c.called_party then c.calling_party
    end))
    from cluster this, strong_calls c
    where (this.party = c.calling_party and position(c.called_party in this.path) = 0)
    or (this.party = c.called_party and position(c.calling_party in this.path) = 0)
  )
)
select party, path
from cluster
where not exists (
  select *
  from cluster c2
  where cluster.party = c2.party
  and (
    char_length(cluster.path) > char_length(c2.path)
    or (char_length(cluster.path) = char_length(c2.path)) and (cluster.path > c2.path)
  )
)
order by party, path;

As you can see, you were very much on the right track.

If you actually do want a list of all the calls in the cluster, with paths, then, er, i’ll get back to you!

EDIT: Bear in mind that the queries which don’t construct a path will have very different performance characteristics to those which do. Roughly speaking, the non-path queries will do O(n) work (probably in about O(log n) iteration steps), because they visit each node in the cluster, but the path-constructing steps will do much more – O(n!) maybe? – because they have to visit every path through the graph. You’ll be alright if clusters are as big as in your examples, but if they’re much bigger, you may find the runtime is prohibitive.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

Trying not to select same connection multiple times I have tried to use charindex()

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply