I am identifying loops in directed graphs. My function returns a list of lists, each of which stores the nodes of one loop it found.
For instance in a graph where the nodes are connected like this:
(1,2)(2,3)(3,4)(3,5)(5,2)
a loop is found at 2 – 3 – 5 so the function would return:
[[2,3,5]]
There are occasions where there are multiple loops which would return something like:
[[2,3,4],[6,7,8,9]]
This is great, but if there are multiple start points in a graph which join the same loop at different points, such as in the graph:
(1,2)(2,3)(3,4)(3,5)(5,2)(6,3)
both nodes 1 and 6 join the same loop at different points which would return:
[[2,3,5],[3,5,2]]
So here there are two identical loops, which are not identical lists. I want to identify such duplication and remove all but one (it doesn’t matter which).
Note, there may be cases where there are multiple loops, one which is duplicated, such as:
[[2,3,5],[3,5,2],[7,8,9,6]]
I’ve tried looking into itertools:
loops.sort()
list(loops for loops,_ in itertools.groupby(loops))
but that hasn’t helped, and I’m not 100% sure that it is appropriate anyway. Any ideas? I’m on Python 2.4. Thanks for any help.
If you only care about the elements of each loop, and not the order, I would canonicalize each loop by sorting it, and then take the set:
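Something along these lines, as a minimal sketch. The variable names (loops, unique, result) are just for illustration, and I've stuck to constructs I'd expect Python 2.4 to accept (builtin set and sorted, list comprehensions):

```python
loops = [[2, 3, 5], [3, 5, 2], [7, 8, 9, 6]]

# Sorting each loop makes every rotation of the same loop identical.
# Lists are not hashable, so each sorted loop is turned into a tuple
# before it goes into the set.
unique = set([tuple(sorted(loop)) for loop in loops])

# Optional: convert back to a list of lists in a stable order.
result = sorted([list(t) for t in unique])
```

Note that this throws away the order in which the nodes are visited, so two different loops over the same set of nodes would be merged into one.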
In order to use set here you need to convert to a tuple. You could convert the tuples back to lists, or turn the final set back into a list (maybe even using sorted to get a canonical order), but whether you’d actually need to would depend upon what you’d be doing with it.
If you need to preserve path order, I’d canonicalize in a different way:
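One way to do that, sketched here under the assumption that node labels can be compared with each other, is to rotate each loop so that it starts at its smallest node. The function name canonicalize is my own choice:

```python
def canonicalize(loop):
    """Rotate loop so it starts at its smallest node; return it as a tuple."""
    start = loop.index(min(loop))  # position of the smallest node
    return tuple(loop[start:] + loop[:start])
```

With this, [2,3,5] and [3,5,2] both become (2,3,5), while a loop over the same nodes in the opposite direction, such as [2,5,3], stays distinct.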
and then
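use the canonical form as the key when removing duplicates. A rough, self-contained sketch follows; unique_loops and canonical_key are hypothetical helper names, and a plain set of the canonical tuples would also do if you don't care which copy of a loop survives or in what order they come out:

```python
def canonical_key(loop):
    # Rotate the loop to start at its smallest node so that all
    # rotations of one loop produce the same hashable tuple.
    start = loop.index(min(loop))
    return tuple(loop[start:] + loop[:start])

def unique_loops(loops):
    """Keep the first occurrence of each loop, ignoring rotation."""
    seen = set()
    result = []
    for loop in loops:
        key = canonical_key(loop)
        if key not in seen:
            seen.add(key)
            result.append(loop)
    return result

deduped = unique_loops([[2, 3, 5], [3, 5, 2], [7, 8, 9, 6]])
```

Here deduped keeps [2,3,5] and [7,8,9,6] in their original form and drops the rotated copy [3,5,2].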
[Edit: note that this simple canonicalization only works if each vertex can only be visited once in a path.]