I am aware of Twissandra which is an example twitter clone using Cassandra but

Question


Asked: May 24, 2026 (2026-05-24T21:44:38+00:00)


I am aware of Twissandra, which is an example Twitter clone built on Cassandra, but I would like to know whether anyone has shared a Cassandra schema designed not for cloning Twitter but for storing tweets coming in through the Twitter Streaming API.


1 Answer

Editorial Team · Answer 1 · 2026-05-24T21:44:38+00:00

It very much depends on what sort of queries you want to run against the data once you have ingested it. From your previous question, “Dumping Twitter Streaming API tweets…”, it looks like you probably just want to do big batch processing on it.

If that is the case, you only need to worry about load balancing: each node in the cluster should handle 1/n of the write load and hold 1/n of the data. Using the random partitioner and inserting one row per tweet, with the status id as the row key, will achieve this.
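To see why hashed row keys spread the load, here is a rough simulation rather than real Cassandra code. It assumes the random partitioner places a row by an MD5 hash of its key, and it models the ring as n equal token ranges; the function names, the ring size constant, and the starting status id are all made up for illustration.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

# Assumed token space for this sketch; each node owns an equal slice of it.
RING_SIZE = 2 ** 128


def token_for(row_key: str) -> int:
    """Hash a row key onto the ring (simplified: raw MD5 as an integer)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def node_tokens(num_nodes: int) -> list:
    """Upper bound of each node's range, evenly spaced around the ring."""
    return [(i + 1) * RING_SIZE // num_nodes - 1 for i in range(num_nodes)]


def node_for(row_key: str, tokens: list) -> int:
    """Index of the first node whose upper bound is >= the key's token."""
    return bisect_left(tokens, token_for(row_key))


def simulate(num_tweets: int, num_nodes: int, first_id: int = 1_000_000_000) -> Counter:
    """Count rows per node when sequential status ids are used as row keys."""
    tokens = node_tokens(num_nodes)
    return Counter(
        node_for(str(first_id + i), tokens) for i in range(num_tweets)
    )


if __name__ == "__main__":
    for node, rows in sorted(simulate(40_000, 4).items()):
        print(f"node {node}: {rows} rows")
```

Even though the status ids here are sequential, hashing scatters them, so each simulated node ends up with roughly a quarter of the rows. With an order-preserving partitioner, the same sequential keys would tend to land on one node at a time.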

However, if you want to run queries like “give me all tweets for a given user”, you will need a slightly more complicated schema, because the schema suggested above would require scanning all the data. You could instead store multiple tweets per row: the row key is the user id, the column name is the tweet id, and the column value is the tweet. You could then use get_slice to answer that query.
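The row-per-user layout can be pictured with a small in-memory model. This is only a sketch of the data shape, not a Cassandra client: the TweetsByUser class is hypothetical, and its get_slice method merely imitates the idea of slicing a row's sorted columns (the real get_slice call takes different arguments).

```python
from bisect import bisect_left, bisect_right, insort


class TweetsByUser:
    """Toy model of a wide row: row key = user id, column name = tweet id."""

    def __init__(self):
        self._column_names = {}  # user_id -> sorted list of tweet ids
        self._values = {}        # (user_id, tweet_id) -> tweet text

    def insert(self, user_id, tweet_id, text):
        """Add or overwrite one column in the user's row."""
        names = self._column_names.setdefault(user_id, [])
        if (user_id, tweet_id) not in self._values:
            insort(names, tweet_id)  # keep column names sorted
        self._values[(user_id, tweet_id)] = text

    def get_slice(self, user_id, start=None, finish=None, count=100):
        """Return up to `count` (tweet_id, text) pairs with start <= id <= finish."""
        names = self._column_names.get(user_id, [])
        lo = 0 if start is None else bisect_left(names, start)
        hi = len(names) if finish is None else bisect_right(names, finish)
        return [(t, self._values[(user_id, t)]) for t in names[lo:hi][:count]]


if __name__ == "__main__":
    store = TweetsByUser()
    store.insert("alice", 103, "third")
    store.insert("alice", 101, "first")
    store.insert("bob", 102, "hello")
    print(store.get_slice("alice"))
```

Because all of a user's tweets live in one row with columns ordered by tweet id, "all tweets for a given user" touches a single row instead of every row in the store. The trade-off is that very active users produce very wide rows.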

A good (somewhat related) blog post: http://blog.insidesystems.net/basic-time-series-with-cassandra
