I am using Cassandra to store my data and Hive to process my data.

Asked: June 18, 2026

I am using Cassandra to store my data and Hive to process it.
I have 5 machines running Cassandra and 2 machines that I use as analytics nodes (where Hive runs).
My question: does Hive run MapReduce only on the two analytics nodes and pull the data over to them, or does it also move the computation to the 5 Cassandra nodes and process the data there? (As far as I know, in Hadoop the computation moves to the data, not the data to the computation.)

1 Answer

Editorial Team · Answer 1 · 2026-06-18T17:49:48+00:00

If you are interested in combining Hadoop and Cassandra, the first place to look is DataStax, a company built around this concept: http://www.datastax.com/
They build and support a Hadoop distribution in which HDFS is replaced with Cassandra.
To the best of my understanding, it does have data locality: http://blog.octo.com/en/introduction-to-datastax-brisk-an-hadoop-and-cassandra-distribution/

There is also a good answer about Hadoop and Cassandra data locality when running MapReduce against Cassandra:
Cassandra and MapReduce – minimal setup requirements

Regarding your question, there is a trade-off:
a) If you run Hadoop/Hive on separate nodes, you lose data locality, so your data throughput is limited by your network bandwidth.
b) If you run Hadoop/Hive on the same nodes as Cassandra, you can get data locality, but the MapReduce processing behind Hive queries might clog your network (and other resources) and thereby degrade the quality of service you get from Cassandra.
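To get a rough feel for option (a) versus option (b), here is a back-of-the-envelope sketch in Python. Every figure in it (data size, network and disk throughput) is a made-up assumption for illustration, not a measurement, and the function `scan_time_seconds` is a hypothetical helper. It ignores replication, compaction, and contention with live Cassandra traffic, so treat it only as a way to plug in your own numbers.

```python
def scan_time_seconds(data_gb: float, nodes: int, per_node_mb_per_s: float) -> float:
    """Estimate the time to read data_gb when `nodes` readers work in
    parallel and each one sustains per_node_mb_per_s."""
    total_mb = data_gb * 1024
    return total_mb / (nodes * per_node_mb_per_s)


# Hypothetical figures; replace them with values measured on your cluster.
DATA_GB = 500        # assumed amount of data a Hive query has to scan
NETWORK_MB_S = 100   # assumed usable network throughput per analytics node
DISK_MB_S = 150      # assumed local read throughput per Cassandra node

# (a) Hive on 2 separate analytics nodes: all data crosses the network.
remote = scan_time_seconds(DATA_GB, nodes=2, per_node_mb_per_s=NETWORK_MB_S)

# (b) Tasks co-located with the 5 Cassandra nodes: reads are local.
local = scan_time_seconds(DATA_GB, nodes=5, per_node_mb_per_s=DISK_MB_S)

print(f"separate Hive nodes: about {remote / 60:.0f} minutes")
print(f"co-located tasks:    about {local / 60:.0f} minutes")
```

The estimate only covers the read side. It says nothing about how much the co-located MapReduce tasks would slow down Cassandra itself, which is the cost described in (b).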

My suggestion is to keep separate Hive nodes if the performance of your Cassandra cluster is critical.
If Cassandra is mostly used as a data store and does not handle real-time requests, then running Hive on each node will improve performance and hardware utilization.
