Take twissandra as an example: when a new tweet is received, not only does the column family for the tweet need to be updated, but a few other (super) column families need to be updated as well. We can certainly do this sequentially, updating each family one by one when the new tweet arrives. But what if a dozen column families need to be updated? This doesn't seem like an efficient approach to me. Is there another way, either relying on the programming framework (e.g. Java), on other “utilities” such as a message queue, or on a better programming model/design pattern, to make this more scalable and maintainable?
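Yes, you can use batch mutations to group a set of updates together and send them in one request instead of one request per column family. In Hector, for example, this is done with a Mutator: you queue up the insertions and then execute them together.
See https://github.com/rantav/hector/wiki/User-Guide under “Inserting Multiple Columns with Mutator”.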
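To make the idea concrete, here is a minimal, self-contained sketch of the batching pattern. It does not use Hector itself: `FakeKeyspace`, `BatchMutator` and `TweetWriter` are hypothetical classes written for illustration, with an in-memory map standing in for Cassandra. The `addInsertion(...)` then `execute()` shape is only loosely modelled on the Mutator usage described in the guide, and the column family names are illustrative, so treat this as a sketch under those assumptions rather than the library's actual API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a keyspace; NOT part of Hector.
class FakeKeyspace {
    // column family -> row key -> column name -> value
    final Map<String, Map<String, Map<String, String>>> data = new LinkedHashMap<>();
    int roundTrips = 0; // how many batches were "sent" to the store

    void applyBatch(List<String[]> insertions) {
        roundTrips++; // one call here models one network round trip
        for (String[] ins : insertions) {
            data.computeIfAbsent(ins[0], cf -> new LinkedHashMap<>())
                .computeIfAbsent(ins[1], key -> new LinkedHashMap<>())
                .put(ins[2], ins[3]);
        }
    }
}

// Hypothetical batch builder: queues insertions, then flushes them together.
class BatchMutator {
    private final FakeKeyspace keyspace;
    private final List<String[]> pending = new ArrayList<>();

    BatchMutator(FakeKeyspace keyspace) {
        this.keyspace = keyspace;
    }

    // Queue one column write; nothing is sent until execute() is called.
    BatchMutator addInsertion(String rowKey, String columnFamily, String name, String value) {
        pending.add(new String[] {columnFamily, rowKey, name, value});
        return this;
    }

    // Send all queued writes as a single batch; returns how many were sent.
    int execute() {
        int count = pending.size();
        if (count > 0) {
            keyspace.applyBatch(new ArrayList<>(pending));
            pending.clear();
        }
        return count;
    }
}

// Hypothetical write path for a new tweet that touches several column families.
class TweetWriter {
    static int saveTweet(FakeKeyspace ks, String tweetId, String user,
                         String body, List<String> followers) {
        BatchMutator m = new BatchMutator(ks);
        m.addInsertion(tweetId, "Tweet", "body", body)
         .addInsertion(tweetId, "Tweet", "user", user)
         .addInsertion(user, "Userline", tweetId, "");
        for (String follower : followers) {
            m.addInsertion(follower, "Timeline", tweetId, "");
        }
        return m.execute(); // all families updated in one batch
    }
}
```

Calling `TweetWriter.saveTweet(ks, "t1", "alice", "hello", followers)` queues one write per column family and follower, but `ks.roundTrips` only goes up by one. Adding another column family to the write path then means adding one more `addInsertion` call rather than another round trip.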