An upcoming project of mine is considering a design that involves (what I’m calling) ‘abstract entity references’. It’s quite a departure from a more common data model design, but it may be necessary to achieve the flexibility we want. I’m wondering if other architects have experience with systems like this and where the caveats are.
The project has a requirement to control access to various entities (logically: business objects; physically: database rows) by various people. For example, we might want to create rules like:
- User Alice is a member of Company Z
- User Bob is the manager of Group Y, which has users Charlie, Dave, and Eve.
- User Frank may enter data for [critical business object] X, and also the [critical business objects] in [critical business object group] U.
- User George is not a member of Company T but may view the reports for Company T.
The idea is that we have a lot of different securable objects, roles, groups, and permissions, and we want a system to handle this. Ideally this system would require little to no coding for new situations once it’s launched; it should be very flexible.
In a ‘traditional’ data design, we might have entities/tables like this:
- User
- Company
- User/Company Cross-Reference
- UserGroup
- User/UserGroup Cross-Reference
- CBO (‘Critical Business Object’)
- User/CBO Cross-Reference
- CBOGroup
- User/CBOGroup Cross-Reference
- CBO/CBOGroup Cross-Reference
- ReportAccess, which is a cross-reference between User and Company specifically for access to reports
Note the large number of cross-reference tables. This system isn’t terribly flexible: any time we want to add a new means of access, we’d need to introduce a new cross-reference table, and that in turn means additional coding.
The proposed system has all of the major entities (User, Company, CBO) reference a value in a new table called Entity. (In the code we’d probably make all of these entities subclasses of an Entity superclass.) Then there are two additional tables that reference Entity:
- Group, which is also an Entity ‘subclass’.
- EntityRelation, which is a relation between two entities of any type (including Group). This will probably also have some sort of ‘Relationship Type’ field to explain/qualify the relationship.
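To make the proposal concrete, here is a minimal sketch of how these tables might look, assuming SQLite driven from Python’s sqlite3 module. The column names, the `entity_type` discriminator, the `EntityGroup` table name (GROUP is a reserved word in SQL), and the `add_entity` helper are all my own assumptions for illustration, not a settled design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Entity (
    entity_id   INTEGER PRIMARY KEY,
    entity_type TEXT NOT NULL          -- assumed values: 'User', 'Company', 'CBO', 'Group'
);
CREATE TABLE User (                    -- one 'subclass' table per entity type
    entity_id INTEGER PRIMARY KEY REFERENCES Entity(entity_id),
    name      TEXT NOT NULL
);
CREATE TABLE EntityGroup (             -- the proposed Group table, renamed
    entity_id INTEGER PRIMARY KEY REFERENCES Entity(entity_id),
    name      TEXT NOT NULL
);
CREATE TABLE EntityRelation (          -- relates any two entities, of any type
    from_entity_id    INTEGER NOT NULL REFERENCES Entity(entity_id),
    to_entity_id      INTEGER NOT NULL REFERENCES Entity(entity_id),
    relationship_type TEXT NOT NULL,
    PRIMARY KEY (from_entity_id, to_entity_id, relationship_type)
);
""")


def add_entity(entity_type):
    """Insert a row into Entity and return its generated id (hypothetical helper)."""
    cur = conn.execute("INSERT INTO Entity (entity_type) VALUES (?)", (entity_type,))
    return cur.lastrowid


# "User Bob is the manager of Group Y"
bob = add_entity("User")
conn.execute("INSERT INTO User VALUES (?, ?)", (bob, "Bob"))
group_y = add_entity("Group")
conn.execute("INSERT INTO EntityGroup VALUES (?, ?)", (group_y, "Group Y"))
conn.execute(
    "INSERT INTO EntityRelation VALUES (?, ?, ?)", (bob, group_y, "IsManagerOf")
)
```

Note that the foreign keys only guarantee that both ends of a relation are entities of some kind; nothing in this schema says which entity types or relationship types belong together.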
This system, at least at first glance, looks like it would meet a lot of our requirements. We might introduce new Entities down the road, but we’d never need to add tables to handle the grouping and relationships between these entities, because Group and EntityRelation can already handle that.
I’m concerned, however, that this might not work very well in practice. The relationships between entities would become very complex and might be very hard for people (users and developers alike) to understand. They’d also be very recursive, which would make things more difficult for our SQL-dependent report writing staff.
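As a rough illustration of the recursion concern, this is the kind of query a report writer might need just to answer "who belongs to Company Z, directly or through groups?". It is only a sketch, assuming SQLite (which supports recursive common table expressions) and a simplified EntityRelation table keyed by name; the row nesting Group Y inside Company Z is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EntityRelation (
    from_entity       TEXT NOT NULL,
    to_entity         TEXT NOT NULL,
    relationship_type TEXT NOT NULL
);
INSERT INTO EntityRelation VALUES
    ('Alice',   'Company Z', 'IsMemberOf'),
    ('Charlie', 'Group Y',   'IsMemberOf'),
    ('Group Y', 'Company Z', 'IsMemberOf');   -- assumed nesting of a group in a company
""")

# Walk IsMemberOf edges upward-to-downward from the target entity.
# UNION (rather than UNION ALL) drops duplicates, so a membership cycle
# cannot make the recursion run forever.
MEMBERS_OF = """
WITH RECURSIVE member(entity) AS (
    SELECT from_entity
    FROM EntityRelation
    WHERE to_entity = ? AND relationship_type = 'IsMemberOf'
    UNION
    SELECT r.from_entity
    FROM EntityRelation AS r
    JOIN member AS m ON r.to_entity = m.entity
    WHERE r.relationship_type = 'IsMemberOf'
)
SELECT entity FROM member ORDER BY entity
"""

members = [row[0] for row in conn.execute(MEMBERS_OF, ("Company Z",))]
print(members)  # ['Alice', 'Charlie', 'Group Y']
```

The result mixes users and groups, so a real report would still have to join back to the entity tables to filter by type.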
Does anyone have experiences with a similar system?
You’re modeling a set of business rules in the real world that are themselves complex. So it’s not surprising that your model is going to be complex no matter how you do it.
I would recommend that you choose a database design that describes the relationships more accurately, instead of trying to be clever. Your clever design may result in fewer tables (though not by an order of magnitude, actually), but your trade-off is a lot more application code to manage it.
For example, you already know that it’s going to cause confusion for users and for report designers. Another weakness is making sure the ‘relationship type’ column contains only meaningful strings for the entities involved in the relationship. E.g. it makes sense to say ‘Bob IsMemberOf UserGroup4’, but what does it mean if ‘CBO CanViewReportsOf Bob’? Also, how do you prevent mutually exclusive conditions, such as ‘Bob IsMemberOf Company1’ and ‘Bob IsMemberOf Company2’?
You have to write application code to validate the data before inserting it, and after fetching it (because your code can never be sure another part of the code hasn’t introduced a data integrity bug). You may also need to write application code to perform quality control checks on the whole database, and clean up anomalies when they occur.
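To give a sense of what that validation code could look like, here is a minimal sketch in Python. Everything in it is hypothetical: the `Relation` tuple, the `ALLOWED` whitelist of (subject type, relationship, target type) combinations, and the `EXCLUSIVE` set that encodes an assumed "one company per user" rule. A real system would need a check like this on every write path, and arguably on reads too.

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    subject: str
    subject_type: str
    relationship: str
    target: str
    target_type: str


# Hypothetical whitelist of meaningful (subject type, relationship, target type) shapes.
ALLOWED = {
    ("User", "IsMemberOf", "UserGroup"),
    ("User", "IsMemberOf", "Company"),
    ("User", "CanViewReportsOf", "Company"),
}

# Shapes a subject may hold for at most one target (assumed rule: one company per user).
EXCLUSIVE = {("User", "IsMemberOf", "Company")}


def shape_of(rel: Relation):
    return (rel.subject_type, rel.relationship, rel.target_type)


def validate(new: Relation, existing: List[Relation]) -> List[str]:
    """Return a list of problems with `new`; an empty list means it is acceptable."""
    errors = []
    shape = shape_of(new)
    if shape not in ALLOWED:
        errors.append(
            f"{new.subject_type} {new.relationship} {new.target_type} is not meaningful"
        )
    if shape in EXCLUSIVE:
        for old in existing:
            same_subject = old.subject == new.subject and shape_of(old) == shape
            if same_subject and old.target != new.target:
                errors.append(
                    f"{new.subject} already {old.relationship} {old.target}"
                )
    return errors


existing = [Relation("Bob", "User", "IsMemberOf", "Company1", "Company")]
ok = validate(Relation("Bob", "User", "IsMemberOf", "UserGroup4", "UserGroup"), existing)
nonsense = validate(Relation("CBO1", "CBO", "CanViewReportsOf", "Bob", "User"), existing)
conflict = validate(Relation("Bob", "User", "IsMemberOf", "Company2", "Company"), existing)
```

Here `ok` is empty, while `nonsense` and `conflict` each carry one error message. None of this is enforced by the database, which is the point of the objection.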
Compare with a database design in which it’s impossible to enter invalid relationships, because the database metadata includes constraints that prevent it. This would simplify your application code a great deal.
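For contrast, here is a sketch of the constraint-based approach, again assuming SQLite via Python’s sqlite3 module. The table and column names are illustrative, the "one company per user" rule is an assumption carried over from the mutual-exclusion example, and `try_insert` is a hypothetical helper that only reports whether the database accepted a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE Company (
    company_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE User (
    user_id    INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    -- A single NOT NULL column: a user belongs to exactly one company.
    company_id INTEGER NOT NULL REFERENCES Company(company_id)
);
CREATE TABLE ReportAccess (
    -- Only a user can hold access, and only to a company's reports.
    user_id    INTEGER NOT NULL REFERENCES User(user_id),
    company_id INTEGER NOT NULL REFERENCES Company(company_id),
    PRIMARY KEY (user_id, company_id)
);
INSERT INTO Company VALUES (1, 'Company1'), (2, 'Company2');
INSERT INTO User VALUES (1, 'Bob', 1);
INSERT INTO ReportAccess VALUES (1, 2);
""")


def try_insert(sql, params):
    """Attempt one insert; report whether the database accepted or rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


grant = "INSERT INTO ReportAccess (user_id, company_id) VALUES (?, ?)"
valid = try_insert(grant, (1, 1))           # Bob may view Company1's reports
unknown_company = try_insert(grant, (1, 99))  # no such company
duplicate = try_insert(grant, (1, 2))       # this grant already exists
no_company = try_insert(
    "INSERT INTO User (name, company_id) VALUES (?, ?)", ("George", None)
)
```

Only `valid` succeeds; the other three are refused by the database itself, with no application-side validation involved. The cost is the one the question already noted: a new kind of access means a new table.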
You also identify hierarchical access privileges: if ‘Bob CanViewReportsOf Company1’, then should he be able to view reports of any UserGroup or CBO that is a member of that company? Or do you need to enter a separate row for every entity’s reports Bob can read? These are policy problems that will exist regardless of which design you use.
To reply to your comments:
I can certainly empathize with byzantine exception-cases and evolving requirements making it hard to design simple solutions.
I have worked on systems that tried to model real-world policies that grew so complex that it seemed foolish to try to codify them in software. Ultimately, the client who hired me would have used their money more effectively by hiring one or two full-time administrative assistants to track their projects using paper and pencil. New exception cases that took me weeks to implement in software would have taken minutes to describe to the AA.
Automation is harder than doing things manually. The only way automation is justified is if the information needs to be tracked faster, or with higher volume, than a human could do.