I have been playing around with using graphs to analyze big data. Its been

Question

Asked: May 31, 2026 (2026-05-31T21:58:24+00:00)


I have been playing around with using graphs to analyze big data. It’s been working great and is really fun, but I’m wondering what to do as the data gets bigger and bigger.

Let me know if there’s any other solution, but I thought of trying HBase because it scales horizontally and I can get Hadoop to run analytics on the graph (most of my code is already written in Java). However, I’m unsure how to structure a graph in a NoSQL database. I know each node can be an entry in the database, but I’m not sure how to model edges and add properties to them (like names of nodes, attributes, PageRank, weights on edges, etc.).

Seeing how HBase/Hadoop are modeled after Bigtable and MapReduce, I suspect there is a way to do this, but I’m not sure how. Any suggestions?

Also, does what I’m trying to do make sense? Or are there better solutions for big data graphs?

1 Answer

Editorial Team · Answer 1 · 2026-05-31T21:58:25+00:00

You can store an adjacency list in HBase/Accumulo in a column-oriented fashion. I’m more familiar with Accumulo (HBase terminology might be slightly different), so you might use a schema similar to:

SrcNode(RowKey) EdgeType(CF):DestNode(CFQ) Edge/Node Properties(Value)

Where CF=ColumnFamily and CFQ=ColumnFamilyQualifier
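To make the layout above concrete, here is a minimal sketch in plain Java. It is an in-memory stand-in built on `TreeMap` that only mimics sorted rows and columns; it does not use the HBase or Accumulo client API. The class name `AdjacencyTableSketch`, its methods, and the `:` separator between family and qualifier are all hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// In-memory stand-in for a sorted, column-oriented table.
// Assumption: this is NOT the HBase/Accumulo client API, only the same shape.
class AdjacencyTableSketch {
    // row key (source node) -> sorted columns "edgeType:destNode" -> value (edge properties)
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Store one edge as one cell in the source node's row.
    // Assumption: edgeType never contains ':'.
    void putEdge(String src, String edgeType, String dest, String props) {
        rows.computeIfAbsent(src, k -> new TreeMap<>()).put(edgeType + ":" + dest, props);
    }

    // Scan a single column family within one row: every destination of that edge type.
    List<String> neighbors(String src, String edgeType) {
        List<String> out = new ArrayList<>();
        TreeMap<String, String> cols = rows.get(src);
        if (cols == null) {
            return out;
        }
        String prefix = edgeType + ":";
        // Columns are sorted, so start at the prefix and stop once it no longer matches.
        for (String col : cols.tailMap(prefix, true).keySet()) {
            if (!col.startsWith(prefix)) {
                break;
            }
            out.add(col.substring(prefix.length()));
        }
        return out;
    }

    // Read the properties stored as the value of one edge cell, or null if absent.
    String edgeProps(String src, String edgeType, String dest) {
        TreeMap<String, String> cols = rows.get(src);
        return cols == null ? null : cols.get(edgeType + ":" + dest);
    }
}
```

Because all outgoing edges of a node live in one row, fetching a node's neighbors is a single-row read, and restricting it to one edge type is a scan over one column family.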

You might also store node/vertex properties as separate rows using something like:

Node(RowKey) PropertyType(CF):PropertyValue(CFQ) PropertyValue(Value)

The PropertyValue could be either in the CFQ or the Value
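The two placements of the property value behave differently, which the following sketch tries to show. As before, it is a plain-Java, in-memory stand-in (not the HBase or Accumulo API), and `NodePropertySketch` and its method names are hypothetical. It assumes a cell is identified by row, family, and qualifier, so a value placed in the qualifier creates a new cell per value, while a value placed in the cell value overwrites the previous one (cell versioning is ignored here).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// In-memory stand-in for node-property rows; NOT the HBase/Accumulo client API.
class NodePropertySketch {
    // row key (node) -> sorted columns "propertyType:qualifier" -> cell value
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Layout A: property value in the qualifier, cell value left empty.
    // Each distinct value becomes its own cell, so a property can be multi-valued.
    void putInQualifier(String node, String propType, String propValue) {
        rows.computeIfAbsent(node, k -> new TreeMap<>()).put(propType + ":" + propValue, "");
    }

    // Layout B: fixed (empty) qualifier, property value in the cell value.
    // Writing again replaces the earlier value.
    void putInValue(String node, String propType, String propValue) {
        rows.computeIfAbsent(node, k -> new TreeMap<>()).put(propType + ":", propValue);
    }

    // Layout A read: collect every qualifier stored under the property type.
    List<String> readQualifiers(String node, String propType) {
        List<String> out = new ArrayList<>();
        TreeMap<String, String> cols = rows.get(node);
        if (cols == null) {
            return out;
        }
        String prefix = propType + ":";
        for (String col : cols.tailMap(prefix, true).keySet()) {
            if (!col.startsWith(prefix)) {
                break;
            }
            out.add(col.substring(prefix.length()));
        }
        return out;
    }

    // Layout B read: the single cell value for the property type, or null if absent.
    String readValue(String node, String propType) {
        TreeMap<String, String> cols = rows.get(node);
        return cols == null ? null : cols.get(propType + ":");
    }
}
```

Under these assumptions, the qualifier placement may suit set-like properties such as tags, while the value placement may suit a single number that is rewritten often, such as a PageRank score. A given property type should stick to one layout.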

From a graph processing perspective, as mentioned by @Arnon Rotem-Gal-Oz, you could look at Apache Giraph, which is an implementation of Google’s Pregel. Pregel is the method Google uses for large graph processing.

Using HBase/Accumulo as input to Giraph was recently submitted (7 Mar 2012) as a new feature request to Giraph: HBase/Accumulo Input and Output formats (GIRAPH-153).
