I have a directed graph described by A -> B meaning that there exists a connection, always weight 1, FROM A TO B. The problem I want to solve (This is not an academic project) is how can I tell how many common connections there are between two nodes.
To say that in terms of A and B.
There are 2 things that need to get done,
* To look at all of my, B, links coming in (All A’s to some B)
* Count how many common A’s out of all My, Original B, A’s.
I do not know if that makes sense but I’ll show you how far I have come.
* First point.
SELECT A
FROM graph
WHERE B='myid';
As most can tell, part 1 is a very simple question. Part 2 is where things get tricky.
I have been able to get all the A’s with at least 1 connection or more similar.
Second point.
SELECT G.A, count( G2.A ) AS common
FROM graph AS G2
JOIN (
SELECT A, B
FROM graph
WHERE B = 'myid'
) AS G ON G.A = G2.B
So the second point is close because it will return all common links, but it will not return all links which have no common links. Is there a way to get that too?
There is still confusion: Ill try to draw a picture… with words.
Here is the table.
A, B
-----
2, 1
3, 1
2, 3
If I wanted to see how many common links from all incoming links into NODE 1 I should see
A, count
---------
2, 1 // This is for 2's connection to 3.
3, 0
With the current SQL statement I have I see this.
A, count
---------
2, 1 // This is for 2's connection to 3.
Instead of using a subquery, I would just use JOINs:
This gives the answer that you are expecting in your example. You may want to test it out on a wider sample.
You should also do performance testing on this if your data set is of any significant size.