I have been searching for an algorithm that can traverse a graph in parallel using 2 or n processes, without one process stepping into a node that another process has already visited, so that I can speed up scanning the whole graph, but I can't find anything. Is there an algorithm that can do this in parallel? Is it worth it?
Note: the n processes share the same memory for the visited and tovisit nodes.
Thank you.
You can try the consumer-producer model for traversing the graph – but with some modifications from the pure model:
1. Update the queue and the visited set in blocks. This saves synchronization time, because synchronization is needed less frequently.
2. When you add nodes to the queue (and update the visited set), do some extra work to make sure you don't add nodes that were already visited since the set was last checked.

Note that with this approach you are more than likely to search some vertices a few times, but you can bound that by how frequently the queue and the visited set are updated.

Will it be worth it? That is hard to say in general; it depends on many things (graph structure, size, queue implementation, …).
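As a rough illustration of the batched scheme described above, here is a minimal Python sketch. It is an assumption-laden example, not a reference implementation: the names `parallel_traverse`, `batch_size`, and `num_workers` are made up for this post, the graph is assumed to be a dict mapping each node to a list of neighbours, and threads stand in for your processes. Each worker takes a block of nodes under one lock acquisition, expands it without holding the lock, and then commits its findings in one go, re-checking them against the shared visited set. Because of that re-check, this particular variant expands each node only once; the redundant work is limited to neighbours that several workers discover locally before committing.

```python
import threading
from collections import deque


def parallel_traverse(graph, start, num_workers=4, batch_size=8):
    """Traverse everything reachable from `start`.

    Returns (visited, expanded): the set of reached nodes and the list of
    nodes in the order their batches were committed.
    """
    cond = threading.Condition()   # guards all shared state below
    frontier = deque([start])      # shared "to visit" queue
    visited = {start}              # shared visited set
    expanded = []                  # nodes whose neighbours were scanned
    active = 0                     # workers currently holding a batch

    def worker():
        nonlocal active
        known = set()              # stale, worker-local view of visited nodes
        while True:
            with cond:
                # Wait while the queue is empty but others may still add to it.
                while not frontier and active > 0:
                    cond.wait()
                if not frontier:   # empty queue and nobody working: done
                    cond.notify_all()
                    return
                take = min(batch_size, len(frontier))
                batch = [frontier.popleft() for _ in range(take)]
                active += 1

            # Expand the whole block without holding the lock.
            found = set()
            for node in batch:
                for neighbour in graph.get(node, ()):
                    if neighbour not in known:   # cheap local pre-filter
                        found.add(neighbour)

            with cond:
                # The "extra work": re-check against the up-to-date shared set.
                fresh = [n for n in found if n not in visited]
                visited.update(fresh)
                frontier.extend(fresh)
                expanded.extend(batch)
                active -= 1
                cond.notify_all()
            known |= found         # everything in `found` is now visited

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return visited, expanded


if __name__ == "__main__":
    demo = {i: [(i + 1) % 20, (i * 2) % 20] for i in range(20)}
    reached, order = parallel_traverse(demo, 0, num_workers=3, batch_size=4)
    print(len(reached), "nodes reached,", len(order), "expansions")
```

Keep in mind that in standard CPython builds the global interpreter lock prevents threads from speeding up CPU-bound work, so this only shows the synchronization pattern. `batch_size` is the "how often to update" knob: larger blocks mean fewer lock acquisitions but a staler local view.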
You should run a few tests, fine-tune the parameter for “how often to update”, and check empirically which setting is better. Use statistical tools (the Wilcoxon test is usually the de facto standard for this) to determine whether one is better than the other.
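To collect data for such a comparison, something like the following could be used. It is only a sketch: `paired_timings` is a hypothetical helper, and `run_a` and `run_b` are assumed to be zero-argument callables that each run the traversal with one setting of the update frequency. The two settings are timed alternately so that each repeat yields one paired sample.

```python
import time


def paired_timings(run_a, run_b, repeats=10):
    """Time two zero-argument callables alternately.

    Returns a list of (seconds_a, seconds_b) pairs, one pair per repeat.
    """
    pairs = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        run_a()
        t1 = time.perf_counter()
        run_b()
        t2 = time.perf_counter()
        pairs.append((t1 - t0, t2 - t1))
    return pairs


if __name__ == "__main__":
    # Stand-in workloads; replace them with the two traversal configurations.
    samples = paired_timings(lambda: sum(range(100_000)),
                             lambda: sum(range(50_000)),
                             repeats=5)
    for a, b in samples:
        print(f"{a:.6f}s vs {b:.6f}s")
```

The two columns of the result are paired samples, which is the input shape a Wilcoxon signed-rank test expects; if SciPy is installed, `scipy.stats.wilcoxon` accepts them as two sequences.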