Given a node x in an undirected graph, I want to find all nodes that belong to the connected component containing x.
My current implementation identifies all components of the graph and is therefore inefficient for large graphs. I currently use connectedComp from the ggm library to do this, but I would rather run a BFS from RBGL that starts at node x and terminates once its component is fully explored. Any suggestions on how to do this? Any information on parallel graph algorithm implementations that can be called from R would also be appreciated.
library("ggm")
x <- 2
> graph
   1 2 3 4 5 6 7 8 9 10
1  0 0 0 0 0 0 0 0 0  0
2  0 0 1 0 0 1 0 0 0  0
3  0 1 0 0 0 1 1 1 0  0
4  0 0 0 0 0 0 0 0 0  0
5  0 0 0 0 0 0 0 0 0  0
6  0 1 1 0 0 0 0 0 0  0
7  0 0 1 0 0 0 0 0 0  0
8  0 0 1 0 0 0 0 0 0  0
9  0 0 0 0 0 0 0 0 0  0
10 0 0 0 0 0 0 0 0 0  0
graph_object <- as(graph, "graphNEL")
# All connected components of graph using connectedComp function:
comp_list <- connectedComp(graph_object)
> comp_list
$`1`
[1] "1"
$`2`
[1] "2" "3" "6" "7" "8"
$`3`
[1] "4"
$`4`
[1] "5"
$`5`
[1] "9"
$`6`
[1] "10"
# Extract adjacency matrix of component containing x:
comp_x <- seq_along(comp_list)[sapply(comp_list, FUN=function(list) x %in% list)]
> comp_x
[1] 2
comp_x_list <- comp_list[[comp_x]]
> comp_x_list
[1] "2" "3" "6" "7" "8"
comp_x <- graph[comp_x_list, comp_x_list]
> comp_x
  2 3 6 7 8
2 0 1 1 0 0
3 1 0 1 1 1
6 1 1 0 0 0
7 0 1 0 0 0
8 0 1 0 0 0
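In my opinion, preprocessing the graph with union-find will give you the best results: the components are built once in near-linear time in the number of edges, and after that finding the component of any node is a simple lookup.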
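A minimal union-find sketch in base R, assuming the nodes are numbered 1..n and the graph is available as a two-column edge matrix. The helper name uf_build is hypothetical (it is not part of ggm or RBGL), and the edges are the ones from the adjacency matrix in the question.

```r
# Union-find (disjoint-set) sketch with path compression.
# uf_build is an illustrative name, not a function from any package.
uf_build <- function(n, edges) {
  parent <- seq_len(n)              # every node starts as its own root

  find <- function(i) {
    # Walk up to the root of i.
    root <- i
    while (parent[root] != root) root <- parent[root]
    # Path compression: point every node on the path directly at the root.
    while (parent[i] != root) {
      nxt <- parent[i]
      parent[i] <<- root
      i <- nxt
    }
    root
  }

  # Merge the two components joined by each edge.
  for (k in seq_len(nrow(edges))) {
    ra <- find(as.integer(edges[k, 1]))
    rb <- find(as.integer(edges[k, 2]))
    if (ra != rb) parent[rb] <- ra
  }

  # Return the root (component label) of every node.
  vapply(seq_len(n), find, integer(1))
}

# Edges of the example graph: 2-3, 2-6, 3-6, 3-7, 3-8
edges <- rbind(c(2, 3), c(2, 6), c(3, 6), c(3, 7), c(3, 8))
roots <- uf_build(10, edges)

x <- 2
comp_x_nodes <- which(roots == roots[x])   # nodes in the component of x
```

Once roots has been computed, the component of any other node is just which(roots == roots[node]), so the preprocessing cost is paid only once. If only a single node is ever queried, a BFS from that node touches less of the graph.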
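It would also be faster to store the graph as a list of edges instead of an adjacency matrix: the matrix needs memory quadratic in the number of nodes, and finding the neighbours of a node means scanning a whole row no matter how few neighbours it has.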
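As a rough illustration of the single-source approach asked about in the question, here is a base R sketch that turns an edge list into adjacency lists and runs a BFS from x, stopping when the component of x is exhausted. It assumes nodes are numbered 1..n; edges_to_adj and bfs_component are hypothetical helper names, not RBGL functions.

```r
# Build adjacency lists from a two-column edge matrix (undirected graph).
edges_to_adj <- function(n, edges) {
  from <- c(edges[, 1], edges[, 2])
  to   <- c(edges[, 2], edges[, 1])
  # A factor with all n levels keeps isolated nodes as empty vectors.
  unname(split(to, factor(from, levels = seq_len(n))))
}

# BFS from `start`; visits only the component that contains `start`.
bfs_component <- function(adj, start) {
  n <- length(adj)
  visited <- logical(n)
  queue <- integer(n)      # preallocated: each node is enqueued at most once
  queue[1] <- start
  visited[start] <- TRUE
  pos <- 1                 # next queue entry to expand
  last <- 1                # last filled queue entry
  while (pos <= last) {
    v <- queue[pos]
    pos <- pos + 1
    nb <- unique(adj[[v]])
    new_nodes <- nb[!visited[nb]]
    k <- length(new_nodes)
    if (k > 0) {
      visited[new_nodes] <- TRUE
      queue[last + seq_len(k)] <- new_nodes
      last <- last + k
    }
  }
  sort(queue[seq_len(last)])
}

# Edges of the example graph: 2-3, 2-6, 3-6, 3-7, 3-8
edges <- rbind(c(2, 3), c(2, 6), c(3, 6), c(3, 7), c(3, 8))
adj <- edges_to_adj(10, edges)
comp <- bfs_component(adj, start = 2)
```

With the adjacency matrix from the question, the edge matrix could be obtained with which(graph == 1 & upper.tri(graph), arr.ind = TRUE), and the submatrix of the component with graph[comp, comp]. The running time is proportional to the size of the component of x rather than to the whole graph.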
If you need a parallel solution, you should read about BFS implementations on Hadoop.