I’m trying to get my head around Cassandra’s DB model, but I’m not having much luck. Almost all of the documentation out there falls into one of two camps. Either it explains Twissandra, a Twitter clone, which is a fairly simple case and doesn’t really help with learning how to use Cassandra effectively, or it covers very basic low-level material that doesn’t show what you’d typically do with columns and super columns in a real-world situation.
So I thought a forum would be a sufficiently complex application to learn from. Assume a web forum somewhat like phpBB: you can have multiple Forums, each of which displays multiple Topics, and every Topic has multiple Posts.
I first thought of having separate Posts, Topics and Forums columns, but that seemed no different from the way I’d implement this on an RDBMS.
So now I’m wondering how deep the nesting can be in super-column families. Is something like the following pseudo-model an appropriate way of modelling this?
Forums = {
    forum001: {
        name: "General News",
        topics: {
            topic000001: {
                subject: "This is what I think",
                date: "2012-08-24 10:12:13",
                posts: {
                    post20120824.101213: { username: "tom", content: "Blah blah", datetime: "2012-08-24 10:12:13" },
                    post20120824.101513: { username: "dick", content: "Blah blah blah", datetime: "2012-08-24 10:15:13" },
                    post20120824.103213: { username: "harry", content: "Blah blah", datetime: "2012-08-24 10:32:13" }
                }
            },
            topic000002: {
                subject: "OMG Look at this",
                date: "2012-08-24 10:42:13",
                posts: {
                    post20120824.104213: { username: "tom", content: "Blah blah", datetime: "2012-08-24 10:42:13" },
                    post20120824.104523: { username: "dick", content: "Blah blah blah", datetime: "2012-08-24 10:45:23" },
                    post20120824.104821: { username: "harry", content: "Blah blah", datetime: "2012-08-24 10:48:21" }
                }
            }
        }
    },
    forum002: {
        name: "Specific News",
        topics: {
            topic000003: {
                subject: "Whinge whine",
                date: "2012-08-24 10:12:13",
                posts: {
                    post20120824.101213: { username: "tom", content: "Blah blah", datetime: "2012-08-24 10:12:13" },
                    post20120824.101513: { username: "dick", content: "Blah blah blah", datetime: "2012-08-24 10:15:13" }
                }
            }
        }
    }
}
I know these rules don’t give you a model, but your model is completely query-dependent. It’s a different way of looking at the world than an RDBMS. If you really need ad hoc query support (many use cases really don’t), Cassandra isn’t the right choice.
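As a rough illustration of what "query-dependent" could mean for the forum example, here is a minimal Python sketch. It is not Cassandra client code: plain dictionaries stand in for column families, and the names `topics_by_forum`, `posts_by_topic` and the helper functions are hypothetical. It assumes the only two queries you care about are "newest topics in a forum" and "posts in a topic, oldest first", and it keeps one flat structure per query instead of one deeply nested tree.

```python
from bisect import insort
from collections import defaultdict

# Hypothetical in-memory stand-ins for two query-shaped column families.
# The outer key plays the role of the row key; each inner list is kept
# sorted, much as columns are kept sorted by name within a row.
topics_by_forum = defaultdict(list)  # forum_id -> [(created, topic_id, subject)]
posts_by_topic = defaultdict(list)   # topic_id -> [(post_time, username, content)]


def add_topic(forum_id, topic_id, subject, created):
    """Write a topic into the structure that serves the forum listing."""
    insort(topics_by_forum[forum_id], (created, topic_id, subject))


def add_post(topic_id, post_time, username, content):
    """Write a post into the structure that serves the topic view."""
    insort(posts_by_topic[topic_id], (post_time, username, content))


def topics_in_forum(forum_id, limit=20):
    """Query 1: newest topics in a forum (one key lookup, newest first)."""
    return topics_by_forum.get(forum_id, [])[::-1][:limit]


def posts_in_topic(topic_id, limit=50):
    """Query 2: posts in a topic in chronological order (one key lookup)."""
    return posts_by_topic.get(topic_id, [])[:limit]


# Sample data taken from the pseudo-model in the question.
# Timestamps in "YYYY-MM-DD HH:MM:SS" form sort correctly as plain strings.
add_topic("forum001", "topic000001", "This is what I think", "2012-08-24 10:12:13")
add_topic("forum001", "topic000002", "OMG Look at this", "2012-08-24 10:42:13")
add_post("topic000001", "2012-08-24 10:15:13", "dick", "Blah blah blah")
add_post("topic000001", "2012-08-24 10:12:13", "tom", "Blah blah")

newest = topics_in_forum("forum001")
thread = posts_in_topic("topic000001")
```

Each read here touches exactly one key and returns data already in the order the page needs. The trade-off is that every additional query you want to support would likely need its own structure, written to at insert time.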