I was wondering how we can use the Python module NetworkX to implement

Question

Asked: June 6, 2026 (2026-06-06T13:07:11+00:00)

I was wondering how we can use the Python module NetworkX to implement SimRank to compare the similarity of two nodes. I understand that NetworkX provides methods for looking at neighbors, and link analysis algorithms such as PageRank and HITS, but is there one for SimRank?

Examples and tutorials are welcome too!

1 Answer

Editorial Team · Answer 1 · 2026-06-06T13:07:13+00:00

Update
I implemented a networkx_addon library, which includes SimRank. See https://github.com/hhchen1105/networkx_addon for details.

Sample Usage:

    >>> import networkx
    >>> import networkx_addon
    >>> G = networkx.Graph()
    >>> G.add_edges_from([('a','b'), ('b','c'), ('a','c'), ('c','d')])
    >>> s = networkx_addon.similarity.simrank(G)

You can obtain the similarity score between two nodes (say, node ‘a’ and node ‘b’) with

    >>> print(s['a']['b'])

SimRank is a vertex similarity measure. It computes the similarity between two nodes of a graph based on the topology, i.e., the nodes and the links of the graph. To illustrate SimRank, let’s consider the following graph, in which a, b, and c connect to each other, and c is connected to d. How similar node a is to node d depends on how similar a‘s neighbors, b and c, are to d‘s neighbor, c.

    +-------+
    |       |
    a---b---c---d

As seen, this is a recursive definition, so SimRank is computed iteratively until the similarity values converge. Note that SimRank introduces a constant r to represent the relative importance of indirect neighbors versus direct neighbors. The formal equation of SimRank can be found here.
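For reference, the recursion that the function in this answer iterates can be sketched as follows. This is the undirected variant used here, not a quotation from the paper: N(u), the set of neighbors of u, is notation introduced only for this illustration, and r is the constant described above. The original paper states the definition for directed graphs in terms of in-neighbors.

```latex
S(u,u) = 1, \qquad
S(u,v) = \frac{r}{|N(u)|\,|N(v)|} \sum_{x \in N(u)} \sum_{y \in N(v)} S(x,y) \quad (u \neq v)
```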

The following function takes a networkx graph G and the relative importance parameter r as input, and returns the SimRank similarity value sim between every pair of nodes in G. The return value sim is a dictionary of dictionaries of floats. To access the similarity between node a and node b in graph G, simply read sim[a][b].

    import copy
    from collections import defaultdict

    def simrank(G, r=0.9, max_iter=100):
      # init. vars: sim starts as the identity, sim_old as all zeros
      sim_old = {}
      sim = {}
      for n in G.nodes():
        sim[n] = defaultdict(int)
        sim[n][n] = 1
        sim_old[n] = defaultdict(int)
        sim_old[n][n] = 0

      # iterate until the values converge or max_iter is reached
      for iter_ctr in range(max_iter):
        if _is_converge(sim, sim_old):
          break
        sim_old = copy.deepcopy(sim)
        for u in G.nodes():
          for v in G.nodes():
            if u == v:
              continue
            # list() works whether neighbors() returns a list or an iterator
            nbrs_u = list(G.neighbors(u))
            nbrs_v = list(G.neighbors(v))
            if not nbrs_u or not nbrs_v:
              continue  # avoid dividing by zero; the similarity stays 0
            s_uv = 0.0
            for n_u in nbrs_u:
              for n_v in nbrs_v:
                s_uv += sim_old[n_u][n_v]
            sim[u][v] = r * s_uv / (len(nbrs_u) * len(nbrs_v))
      return sim

    def _is_converge(s1, s2, eps=1e-4):
      # converged when no entry changed by eps or more since the last pass
      for i in s1.keys():
        for j in s1[i].keys():
          if abs(s1[i][j] - s2[i][j]) >= eps:
            return False
      return True

To calculate the similarity values between nodes in the above graph, you can try this.

    >>> G = networkx.Graph()
    >>> G.add_edges_from([('a','b'), ('b', 'c'), ('c','a'), ('c','d')])
    >>> simrank(G)

You’ll get output along these lines (as printed by the original Python 2 session):

    defaultdict(<type 'list'>, {'a': defaultdict(<type 'int'>, {'a': 0, 'c': 0.62607626807407868, 'b': 0.65379221101693585, 'd': 0.7317028881451203}), 'c': defaultdict(<type 'int'>, {'a': 0.62607626807407868, 'c': 0, 'b': 0.62607626807407868, 'd': 0.53653543888775579}), 'b': defaultdict(<type 'int'>, {'a': 0.65379221101693585, 'c': 0.62607626807407868, 'b': 0, 'd': 0.73170288814512019}), 'd': defaultdict(<type 'int'>, {'a': 0.73170288814512019, 'c': 0.53653543888775579, 'b': 0.73170288814512019, 'd': 0})})
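Since the result is a dictionary of dictionaries, ranking the nodes most similar to a given node takes only a few lines. The helper below is a minimal sketch, not part of networkx or networkx_addon: the name most_similar and its parameter k are hypothetical, and the sample scores are the values for node a from the output above, rounded to four decimals.

```python
def most_similar(sim, node, k=3):
    """Return up to k (other_node, score) pairs, highest score first."""
    # Skip the node itself so it is never reported as its own best match.
    scores = [(other, score) for other, score in sim[node].items() if other != node]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]

# Scores for node 'a', rounded from the output above.
sim = {'a': {'b': 0.6538, 'c': 0.6261, 'd': 0.7317}}
print(most_similar(sim, 'a', k=2))  # [('d', 0.7317), ('b', 0.6538)]
```

The same call should work on the dictionary returned by simrank(G), because the helper only relies on the sim[a][b] access pattern.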

Let’s verify the result by calculating similarity between, say, node a and node b, denoted by S(a,b).

S(a,b) = r * (S(b,a)+S(b,c)+S(c,a)+S(c,c))/(2*2) = 0.9 * (0.6538+0.6261+0.6261+1)/4 = 0.6538,

which is the same as our calculated S(a,b) above.
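The arithmetic of this check can be reproduced with a short script. It is only a sketch of the hand calculation above: the variable names are made up for illustration, and the inputs are the reported scores rounded to four decimals.

```python
r = 0.9
# Values rounded from the output above; S(c,c) is 1 by definition.
s_ba, s_bc, s_ca, s_cc = 0.6538, 0.6261, 0.6261, 1.0
# a has neighbors {b, c} and b has neighbors {a, c}, hence the 2 * 2.
s_ab = r * (s_ba + s_bc + s_ca + s_cc) / (2 * 2)
print(round(s_ab, 3))  # 0.654
```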

For more details, you may want to check out the following paper:

G. Jeh and J. Widom. SimRank: a measure of structural-context similarity. In KDD’02 pages 538-543. ACM Press, 2002.
