I have a huge graph that I would like to process using many machines.
I had like to compute if the graph diameter is higher than 50.
How would I split the data and I would I write a parallel algorithm that can calculate it?
(the return value is boolean)
The graph diameter is the greatest distance between any pair of vertices
The standard way to figure this out would be an all-pairs shortest path algorithm — the Floyd-Warshall algorithm is a good place to start. Another option using Hadoop is located here.