In the past i would use the following simple data model to implement security into my applications. I would have an implicit deny, explicit allow and explicit deny which the later could overide the one preceding it.

The problem i’m facing now is that my current application needs to be scalable to tens of millions of records. In the data model above, for each record i could possibly have many ACL entries. So if my records table (acltest_uspaceData) has 10 million rows my acl security tables could have a lot more.
Is this an appropriate data model for the amount of records we plan on storing? If not, What can i do to this datamodel to make it scalable?
I’m using PostGIS as a Database.
Update – Rules:
- It’s should be implied that nobody has access to anything if there
are not security associations made for that record. - To allow user or group access, you must explicitly Allow them to access
that record via some security association. - To deny someone access who already has access via group membership,
you must explicitly deny that person access to that record via a security association.
Based on your security rules, the
acltest_role_acl.allowedlooks redundant – if a role ACL entry is present thenallowed == truecan be implied. Unless, of course, you actually want the ability to “revoke” whole groups and not just individual users.Other than that, your model looks good and you should concentrate on storing it efficiently in the database.
In case your DBMS supports it, it is probably a good idea to store ACLs as a clustered table:
{record_id, user_id}for user ACLs and{record_id, role_id}for role ACLs.{user_id, record_id}and{role_id, record_id}).If your DBMS supports it, you could compress the leading edge of the clustering index (under Oracle, this is known as COMPRESSED INDEX ORGANIZED table). This would remove some of the redundancy stemming from repeated ACLs (from your comments: “it is common to have many people with the same permissions” and “there are many rows with the same permissions”).
On the other hand, if you are primarily worried about storage cost (and/or use a DBMS that does not support compression), but don’t mind increased complexity and potential performance impact, you could consider something like this:
Since many records share the same ACLs (as per your comment), this model would avoid most of the repetition.
For example, if N records have the same 5 ACL entries:
acltest_uspaceData.acl_group_id).The problem is: when you insert a new record, how do you find the “matching” ACL group? This could be solved roughly like this, but would not be terribly efficient:
First, you create a temporary table for user and role ACLs (
<user_acl_tmp>and<role_acl_tmp>), then insert the ACL entries the new record should have into them, then execute the above query which searches the existing ACL groups for set equality with temporary ACLs.