If you are using the Cassandra distributed key-value store, you will have several Cassandra nodes, and thus several computers. The data doesn’t just sit there, of course, you also have one or more clients programs that communicate with the Cassandra nodes. Computationally intensive work done by the clients might also be distributed over several computers. Should the clients and the Cassandra nodes be separate computers? Is it OK to use the same computer as a Cassandra node and as a Cassandra client? I expect it would work, in the sense of performing correctly, but would there be unacceptable performance problems?
The Cassandra documentation I’ve seen talks in terms that suggest Cassandra nodes and clients should be separate computers, but I’ve not seen an explicit recommendation.
Why do I ask? Why might I want to do that? The application I have in mind does not require that the clients store any data locally, they use Cassandra for all persistent data. Their job is computationally intensive, so the bottleneck is likely to be client CPU processing rather than Cassandra processing. Not also using them as Cassandra nodes seems wasteful.
Also, if each computation (client) node is also a Cassandra node, I can use the Cassandra token of each node (used for distributing Cassandra’s data) to distribute the client computations.
This is a valid setup for certain types of deployments. The most common case where people do this is when running Hadoop jobs against Cassandra. The Cassandra Wiki recommends you run one Hadoop TaskTracker on each node in your cluster. That type of deployment is similar to what you are describing.