We have a 6-node Cassandra cluster with a very large number of reads per second and very few writes. The whole application consists of:
- web app server that uses one cassandra node
- 5 x web service machines each using its own cassandra node (pycassa’s Pool server_list is always one node)
The web app performs both reads and writes against Cassandra, but very few of them: only when somebody actually uses the app UI, which does not happen often. The web service, however, is very heavily loaded with traffic from a 3rd party service. A load balancer directs the traffic to all 5 servers, and each server bombards its own Cassandra node (which is physically on yet another server) with lots of get() and multiget() requests. Once in a while a set() is used, but that is roughly once per 10 thousand reads.
Given this kind of usage, we decided to use a replication factor of 6. If each Cassandra node has 100% of the data, reads should be faster and load should be balanced more evenly. We updated the keyspace strategy_options and ran nodetool repair on each node to transfer the data. It went OK.
Now the very strange thing: all six Cassandra nodes are at very high CPU usage. That is understandable for the five nodes used by the web service, but we can’t explain why the web app's Cassandra node is also consuming that much CPU, as if it were performing a lot of reads. It’s as if the replication didn’t work at all: it looks like each Cassandra node talks to all the other nodes whenever a get() happens, and the whole ring is extremely stressed.
I ran another experiment to check this: I took down one of the web service servers and watched the corresponding Cassandra node. After the server went down, I expected the CPU usage on this Cassandra node to be near zero, because no other machine points to it. But it wasn’t zero; it dropped slightly but stayed at a very high level (60% CPU usage).
We’re using pycassa and we did not change the consistency level, so it’s at the default, ConsistencyLevel.ONE.
I hope you see what I mean… If the replication factor equals the number of nodes in the ring, and the read consistency level is the default (ONE), then each node should be more or less independent in terms of reads: if no one is reading from a given node, the CPU usage on this node should be minimal, correct? However, even if we disconnect the only client that sees the node, we still observe high CPU usage, as if someone kept reading from it. Where is this load coming from, and how can we investigate what’s going on?
Correct me if I’m wrong, but I guess that the load you’re seeing on the nodes in the cluster comes from read repair, which happens in the background. When you read at ConsistencyLevel.ONE from a node in the cluster, the data is returned immediately, and the read triggers a read repair in the background, which sends a digest query to all other replicas of the requested data to ensure consistency.
Since the replication factor is 6 (all data on all nodes), every node is a replica for every key, so every read ends up touching all 6 nodes: one serves the data and the other 5 receive read repair digest requests.
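To make the arithmetic concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch under assumptions that are mine, not taken from your cluster: the function name `per_node_read_ops` is made up, every client read is assumed to be served locally by the node it was sent to, and `read_repair_chance` stands for the probability that a read also sends a digest request to each of the other replicas. The 1000 reads per second per client is an invented figure for illustration.

```python
def per_node_read_ops(client_reads_per_sec, replication_factor, read_repair_chance):
    """Estimate read operations per second landing on each node.

    client_reads_per_sec: one entry per node, the reads/s that node's own
    client sends to it directly (0 for a node with no active client).
    Hypothetical model: only valid when every node is a replica for every key.
    """
    node_count = len(client_reads_per_sec)
    if replication_factor != node_count:
        raise ValueError("this sketch only models replication_factor == node count")

    total_reads = sum(client_reads_per_sec)
    loads = []
    for own_reads in client_reads_per_sec:
        # Reads sent to this node are assumed to be answered from local data.
        # Reads sent to any other node reach this node as a digest request
        # with probability read_repair_chance.
        reads_elsewhere = total_reads - own_reads
        loads.append(own_reads + read_repair_chance * reads_elsewhere)
    return loads


# Web app node (index 0) is idle, the 5 web service nodes get 1000 reads/s each.
all_clients_up = per_node_read_ops([0, 1000, 1000, 1000, 1000, 1000], 6, 1.0)

# Same cluster with one web service server taken down.
one_client_down = per_node_read_ops([0, 0, 1000, 1000, 1000, 1000], 6, 1.0)

# Same cluster with read repair disabled in the model.
no_read_repair = per_node_read_ops([0, 1000, 1000, 1000, 1000, 1000], 6, 0.0)
```

Under these assumptions, with read repair firing on every read, all six nodes do the same amount of read work, including the idle web app node. Taking one web service server down only removes one fifth of the total, which would match the small drop you observed. With the chance set to 0 in the model, a node with no client does no read work at all, which is the behaviour you expected.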
Cassandra read repair