I’m doing some research and I’ve come to a point where I have to calculate the clustering coefficient of a graph.
According to this paper directly related to my research:
The clustering coefficient C(p) is
defined as follows. Suppose that a
vertex v has kv neighbours; then at
most (kv * (kv-1)) / 2 edges can
exist between them (this occurs when
every neighbour of v is connected to
every other neighbour of v). Let Cv
denote the fraction of these allowable
edges that actually exist. Define C as
the average of Cv over all v
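A minimal Python sketch of the quoted definition may help make it concrete. It assumes the graph is undirected and stored as a dict mapping each vertex to the set of its neighbours; that representation and the function names are illustrative and do not come from the paper. Vertices with fewer than two neighbours are given Cv = 0 here, which is a convention the quoted passage does not specify.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Cv: fraction of the kv*(kv-1)/2 possible edges among v's neighbours that exist."""
    neighbours = adj[v]
    kv = len(neighbours)
    if kv < 2:
        # Assumption: with no pair of neighbours, Cv is taken to be 0.
        return 0.0
    # Count the pairs of neighbours of v that are themselves connected.
    existing = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return existing / (kv * (kv - 1) / 2)


def average_local_clustering(adj):
    """C: the average of Cv over all vertices v."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


# Example graph (hypothetical): a triangle a-b-c with a pendant vertex d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
c_average = average_local_clustering(graph)
```

The cost of this sketch is dominated by enumerating the pairs of neighbours of each vertex, so it grows with the sum of kv squared over all vertices.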
But this Wikipedia article on the subject defines it differently:
C = (number of closed triplets) / (number of connected triples)
It seems to me that the latter is more computationally expensive.
So really my question is: are they equivalent?
It should be noted that the paper is cited by the Wikipedia article.
Thanks for your time.
I think they’re equivalent. The Wikipedia page you link to gives a proof that the triples formulation is equivalent to the fraction-of-possible-edges formulation when calculating the local clustering coefficient, i.e. the coefficient calculated at a single vertex. From there it seems that you just need to show that

$$\frac{\sum_v \lambda(v)}{\sum_v \tau(v)} = \frac{3 \times (\text{number of triangles})}{\text{number of connected triples}}$$

where $\lambda(v)$ is the number of triangles containing v, and $\tau(v)$ is the number of connected triples for which v is the middle vertex, i.e. the vertex incident to both edges of the triple.

Now each triangle gets counted three times in the numerator of the LHS, once for each of its vertices, which matches the factor of 3 on the RHS. Each connected triple, however, is counted only once on the LHS, at its middle vertex, so the denominators are the same.
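The counting argument can be checked on a small graph. The sketch below assumes an undirected graph stored as a dict of neighbour sets; the names `lam`, `tau` and `count_triangles_and_triples` are illustrative choices for lambda(v), tau(v) and a brute-force count, not anything taken from the paper or the Wikipedia article.

```python
from itertools import combinations


def lam(adj, v):
    """lambda(v): number of triangles that contain v."""
    return sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])


def tau(adj, v):
    """tau(v): number of connected triples that have v as the middle vertex."""
    k = len(adj[v])
    return k * (k - 1) // 2


def count_triangles_and_triples(adj):
    """Brute-force count over all 3-vertex subsets, independent of lam and tau."""
    triangles = 0
    triples = 0
    for x, y, z in combinations(adj, 3):
        edges = (y in adj[x]) + (z in adj[x]) + (z in adj[y])
        if edges == 3:
            triangles += 1
            triples += 3  # a triangle gives one connected triple per middle vertex
        elif edges == 2:
            triples += 1  # an open path has exactly one middle vertex
    return triangles, triples


# Example graph (hypothetical): a triangle a-b-c with a pendant vertex d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
sum_lam = sum(lam(graph, v) for v in graph)
sum_tau = sum(tau(graph, v) for v in graph)
triangles, triples = count_triangles_and_triples(graph)

# Each triangle is counted once per vertex; each triple once at its middle vertex.
assert sum_lam == 3 * triangles
assert sum_tau == triples
```

Note that this checks the ratio of the sums of lambda(v) and tau(v). The paper's C averages the per-vertex ratios lambda(v)/tau(v) instead, and a mean of ratios is not in general the same number as a ratio of sums, so it is worth computing both on your own graphs before substituting one for the other.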