The Archive Base

Editorial Team
Asked: June 10, 2026 (2026-06-10T15:26:28+00:00)

We are in the process of evaluating whether to move a multitenant EAV system

We are evaluating whether to move a multitenant EAV system built on PostgreSQL to Cassandra, and I would like input on our schema approach to see whether a test with Cassandra makes sense. Our multitenant hierarchy is account->app, where an account can run multiple apps. Queries need to be segregated by app or by account (aggregating all app data for the account). Accounts can create their own dataobjects with their own custom fields in our EAV model.

There are two approaches that I have considered taking with Cassandra. The first is to hold a certain number of apps (say 20) within one column family, to reduce the number of column families used. Each row would be identified by a composite column of accountid->appid->dataobjectid->recordid. Columns would be added on the fly for each app’s dataobject as needed by that app. So if the column family held two apps, a row for the first app might have 20 columns defined while a row for the second app might have 30, for a total of 50 potential columns across those two apps. Right now the average number of columns for an app is 19, so a column family holding 20 apps would average about 380 columns (20 × 19), or roughly 400. That seems reasonable and takes advantage of Cassandra’s wide column support. In fact, we could probably easily support more apps per column family. The drawback is that secondary indexes would be difficult, as we don’t allow users to create their own indexes, so queries could not be made more efficient without .
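To make the first approach concrete, here is a minimal in-memory sketch in Python. It is not Cassandra code. The class name `WideRowFamily`, the sample ids, and the column names are all hypothetical, and a dictionary scan stands in for whatever lookup the real key layout would allow.

```python
from typing import Any, Dict, Set, Tuple

# Hypothetical composite row key: (account_id, app_id, dataobject_id, record_id)
RowKey = Tuple[str, str, str, str]


class WideRowFamily:
    """In-memory stand-in for one column family shared by several apps."""

    def __init__(self) -> None:
        self.rows: Dict[RowKey, Dict[str, Any]] = {}

    def put(self, key: RowKey, columns: Dict[str, Any]) -> None:
        # Columns are created on the fly; rows need not share a column set.
        self.rows.setdefault(key, {}).update(columns)

    def by_app(self, account_id: str, app_id: str) -> Dict[RowKey, Dict[str, Any]]:
        # Segregate by app: match the first two key components.
        return {k: v for k, v in self.rows.items() if k[:2] == (account_id, app_id)}

    def by_account(self, account_id: str) -> Dict[RowKey, Dict[str, Any]]:
        # Aggregate all apps of an account: match the first key component only.
        return {k: v for k, v in self.rows.items() if k[0] == account_id}

    def distinct_columns(self) -> Set[str]:
        # Union of every column name used by any row in the family.
        return {name for cols in self.rows.values() for name in cols}


cf = WideRowFamily()
cf.put(("acct1", "appA", "contact", "r1"), {"name": "Ada", "email": "ada@example.com"})
cf.put(("acct1", "appB", "invoice", "r1"), {"total": 42.5, "paid": 1})
cf.put(("acct2", "appC", "contact", "r1"), {"name": "Bob"})

print(len(cf.by_app("acct1", "appA")))    # rows for one app
print(len(cf.by_account("acct1")))        # rows across all apps of the account
print(sorted(cf.distinct_columns()))      # potential columns in the family
```

The sketch only illustrates the key shape and the two query scopes (by app and by account). Whether those prefix lookups are efficient in Cassandra depends on how the key is laid out, which this sketch does not model.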

The second approach is to have two column families hold all data for, say, 1000 apps. The first column family would have the same composite column as above, but it would hold the entire dataobject for that row as a JSON document. The second column family would have the same composite key plus one more component, a fieldid that identifies the field within the JSON document (our app’s metadata manager stores UUIDs to identify each “field” within a JSON doc), and would have a “fieldvalue” column for each datatype: string, number, decimal, float (dates and bools get converted to numbers). The nice feature here is that we could easily index each of those columns for search purposes, and we would minimize the number of column families we create.
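The second approach can be sketched the same way, again as plain Python rather than Cassandra code. The two dictionaries stand in for the two column families. The typed column names (`string_value`, `number_value`, `decimal_value`, `float_value`) and the short field ids (real ones would be UUIDs) are assumptions made for illustration.

```python
import json
from decimal import Decimal
from typing import Any, Dict, List, Tuple

RecordKey = Tuple[str, str, str, str]      # account, app, dataobject, record
FieldKey = Tuple[str, str, str, str, str]  # same key plus field_id

documents: Dict[RecordKey, str] = {}                # first family: whole JSON doc
field_values: Dict[FieldKey, Dict[str, Any]] = {}   # second family: one row per field


def typed_columns(value: Any) -> Dict[str, Any]:
    """Place a value in exactly one typed column (hypothetical column names)."""
    row: Dict[str, Any] = {
        "string_value": None,
        "number_value": None,
        "decimal_value": None,
        "float_value": None,
    }
    if isinstance(value, bool):        # bools are converted to numbers
        row["number_value"] = int(value)
    elif isinstance(value, int):
        row["number_value"] = value
    elif isinstance(value, Decimal):
        row["decimal_value"] = value
    elif isinstance(value, float):
        row["float_value"] = value
    else:
        row["string_value"] = str(value)
    return row


def save(key: RecordKey, doc: Dict[str, Any]) -> None:
    # Write the full document, then one typed row per field.
    # Note: rows for fields dropped from a later version are not cleaned up here.
    documents[key] = json.dumps(doc, default=str)
    for field_id, value in doc.items():
        field_values[key + (field_id,)] = typed_columns(value)


def find(account_id: str, app_id: str, field_id: str,
         column: str, value: Any) -> List[RecordKey]:
    # Query by value within one app: scan the per-field rows.
    return [
        k[:4]
        for k, row in field_values.items()
        if k[0] == account_id and k[1] == app_id
        and k[4] == field_id and row[column] == value
    ]


r1 = ("acct1", "appA", "contact", "r1")
r2 = ("acct1", "appA", "contact", "r2")
save(r1, {"f-name": "Ada", "f-age": 36, "f-active": True})
save(r2, {"f-name": "Bob", "f-age": 36})

print(find("acct1", "appA", "f-age", "number_value", 36))
print(json.loads(documents[r1])["f-name"])
```

The point of the split is that a value search only touches the narrow per-field rows, while reading a record back is a single fetch of the JSON document.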

What are the pros and cons of the two approaches above? Am I missing something obvious or misunderstanding Cassandra in the scenarios above (for example, can I have composite columns that are so wide in the first place)? Are there other, better schema suggestions for an app of this type?


1 Answer

  1. Editorial Team
    Added an answer on June 10, 2026 at 3:26 pm (2026-06-10T15:26:30+00:00)

    I think the first question you need to answer when deciding on your data model is “how do I intend to query this data?” In general, you are nowhere near the limit in terms of column families, columns, or number of components in a composite in either model, so I wouldn’t worry about that.

    Since you are concerned about the lack of secondary indexes in your first model, I take it that query-by-value functionality may be important. If so, the second model might serve you better. The caveat is that secondary indexes work best on low-cardinality data, and your data may not fit that case well. If it does not, you can create your own index quite easily, in which case either model will work.
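As a rough illustration of what "create your own index" could look like, here is a minimal Python sketch of an application-maintained inverted index. The names (`records`, `index`, `write`, `lookup`) and the sample data are hypothetical, and it assumes field values are hashable. In Cassandra the data write and the index write would be separate operations, so keeping them consistent would be up to the application.

```python
from collections import defaultdict
from typing import Any, Dict, Set, Tuple

RecordKey = Tuple[str, str, str, str]   # account, app, dataobject, record
IndexKey = Tuple[str, str, str, Any]    # account, app, field name, field value

records: Dict[RecordKey, Dict[str, Any]] = {}
index: Dict[IndexKey, Set[RecordKey]] = defaultdict(set)


def write(key: RecordKey, fields: Dict[str, Any]) -> None:
    """Replace a record and keep the value index in step with it."""
    account_id, app_id = key[0], key[1]
    # Remove index entries that pointed at the previous version of the record.
    for name, value in records.get(key, {}).items():
        index[(account_id, app_id, name, value)].discard(key)
    records[key] = dict(fields)
    # Add one index entry per (field, value) pair of the new version.
    for name, value in fields.items():
        index[(account_id, app_id, name, value)].add(key)


def lookup(account_id: str, app_id: str, name: str, value: Any) -> Set[RecordKey]:
    # Query by value: a single keyed read of the index, no scan of the records.
    return set(index.get((account_id, app_id, name, value), set()))


r1 = ("acct1", "appA", "contact", "r1")
r2 = ("acct1", "appA", "contact", "r2")
write(r1, {"city": "Oslo"})
write(r2, {"city": "Oslo"})
write(r1, {"city": "Rome"})   # the update moves r1 from the Oslo entry to Rome

print(lookup("acct1", "appA", "city", "Oslo"))
print(lookup("acct1", "appA", "city", "Rome"))
```

Because the index is keyed by account and app as well as by value, it works the same way on top of either of the two models in the question.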

    My advice is to figure out how you intend to read your data, then plan your model to match your read patterns. If you’re unsure, play around with both models to see which works best. In my experience it often takes more than one iteration to work out a good model, and you should not be afraid to write your data more than one way. Normalization is not the objective here. If you want to discuss your model more in-depth, check out the Cassandra IRC channel on freenode (#cassandra).

© 2021 The Archive Base. All Rights Reserved