I have some graph databases (friends networks, purchasing history, etc.) that I persist with Neo4j. I plan to analyze these with community detection algorithms such as Girvan Newman. These algorithms usually return a dendrogram, representing the division of the graph from whole network to individual nodes. I am wondering how I might persist these results. I suppose it could be stored as a separate graph, but is there a way to store it within the graph itself? My concern in doing so is the need for creating nodes to represent the groups, which is something I would like to avoid.
I have some graph databases (friends networks, purchasing history, etc.) that I persist with
Share
One way to represent a dendrogram is as a list of pairs, containing (n-1) pairs for n elements. Assuming the left element of the pair is the one whose ID is kept to refer to all elements in a community, a sample dendrogram might look like
So an alternative way to persist that might be to store at each node at which time step it is merged into another node (together with all the nodes that have been previously merged into it).
So you’d attach (0:0) to 1, (1:2) to 3 and (2:0) to 2 (timestep:new ‘name’ of node).
edit: Concretely, this might mean attaching two integer-valued attributes e.g. ‘merge_timestep’ and ‘merge_into’ to each Neo4J node object.