Background:
We are creating a Cassandra Cluster that spans 3 geographically separated Datacenters. We plan to have 2 Cassandra nodes in each data center (2 nodes x 3 sites = 6 nodes total). All the 6 nodes will be the part of same cluster.
The idea is to be able to write data to any node in the cluster and be able to read it from any other node. [We can tolerate 1 second delay in updates].
The Question:
How do we design a client to write to the “cluster”. Cassandra does not have a router or a middle-layer like MongoDB. Do we design so that we write to any node in the ring? If so, what is that node is down (i.e. do we need to make our client aware of all the node IPs in the cluster?)
Thank you.
You can read or write from any node in the cluster, they are all capable of routing the requests to the correct nodes (the node performing the routing is typically referred to as the “coordinator” for an operation). You should try to balance your requests over all nodes in the local datacenter, only using nodes in the remote datacenters if all local nodes are down. Most Cassandra clients will spread requests in a round robin fashion across all of the nodes that you point them at, and as Canausa mentions, some autodiscover other nodes and sometimes use more sophisticated algorithms for picking which node to send a request to.
Writes to any datacenter are automatically replicated to all other datacenters, so you can indeed write to any node and read from any node. Typically, you will want to use consistency level LOCAL_QUORUM for reads and writes, which requires that a quorum of replicas in the local data center respond for the operation to be considered a success. You can also consider writing at EACH_QUORUM, which waits for a response from a quorum of replicas in each datacenter. Obviously, the latency will be much higher in this case, but you can achieve strong consistency across all datacenters.
However, with only 2 nodes in each datacenter, a quorum of replicas is equivalent to all replicas, so if any node goes down, you will lose availability for that portion of the data. For this reason, if you want to use quorum consistency levels, it’s recommended that you have a replication factor of at least 3 in each datacenter, allowing for the loss one replica while still maintaining strong (or locally strong) consistency.