I love the nested-set model for storing hierarchical data, and I’d like to find a similar model for storing task dependencies in a project management application.
Issue 1: Unsustainable complexity of recursive database queries / function calls:
Right now, I have a simple m:n table that stores Task/Blocker pairs, but looping through the data is unoptimized at best, and a recursive nightmare at worst. I’d like to limit database calls in a tight loop; with a “normal” tree, I’d use a nested set to accomplish this.
Issue 2: multiple inheritance, multiple descendance
The reason I can’t use a tree is that this set contains not only branches, but also merges. Some tasks have multiple “parent nodes”, if you will: multiple tasks that have to be completed before they can start. It seems similar to how I assume SVN or Git must work to store versioning information.
I want to run queries like:
- Get all tasks, recursively, dependent on a specific task (top-down traversal)
- Add all time estimates for a specific task and all its dependencies (bottom-up traversal)
- Constrain the list of potential dependencies for a task to logical options (a task can’t depend on itself, directly or through a loop)
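To make the three queries concrete, here is a minimal in-memory sketch in Python. It assumes all Task/Blocker pairs have already been loaded with a single query, so the traversal itself needs no further database calls. The sample data and the names `build_graph`, `reachable`, `total_estimate`, and `allowed_blockers` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (task, blocker) pairs as they might come from the m:n table.
PAIRS = [("B", "A"), ("C", "A"), ("D", "B"), ("D", "C"), ("E", "D")]
# Hypothetical time estimates per task.
ESTIMATES = {"A": 2, "B": 3, "C": 1, "D": 5, "E": 4}


def build_graph(pairs):
    """Index the pairs in both directions so either traversal is a dict lookup."""
    blockers = defaultdict(set)    # task -> tasks it directly depends on
    dependents = defaultdict(set)  # task -> tasks that directly depend on it
    for task, blocker in pairs:
        blockers[task].add(blocker)
        dependents[blocker].add(task)
    return blockers, dependents


def reachable(start, edges):
    """Every node reachable from start by following edges (iterative DFS).

    In an acyclic graph the result never contains start itself.
    """
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def total_estimate(task, blockers, estimates):
    """Estimate for task plus all its dependencies, each counted once
    even when it is reached through several paths (a merge)."""
    return estimates[task] + sum(estimates[t] for t in reachable(task, blockers))


def allowed_blockers(task, all_tasks, blockers, dependents):
    """Tasks that could be added as a dependency of task without a loop:
    everything except the task itself and anything that already depends on it."""
    forbidden = {task} | reachable(task, dependents)
    return set(all_tasks) - forbidden


blockers, dependents = build_graph(PAIRS)
print(sorted(reachable("A", dependents)))          # all tasks dependent on A
print(total_estimate("D", blockers, ESTIMATES))    # D plus everything it depends on
print(sorted(allowed_blockers("B", ESTIMATES, blockers, dependents)))
```

The set of visited nodes is what handles merges: a task reached through two different paths is only visited, and only counted, once.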
Possible options so far:
- Bite the bullet and deal with the complexity
- Index all the sequences (“routes”, if you will, to finish all tasks), though I’m still unsure how to store this
- Store both top-down nested sets and bottom-up nested sets
- Pray that some StackOverflow guru knows more than I do on the subject
What’s the best way (and how sure are you that it will work)?
I would start with tasks and task_relationships tables.
Use the task_relationships table for multiple parents or children.
Each task_relationships row has parent and child fields, both of which are IDs in the tasks table.
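A minimal sketch of that layout, using Python's built-in sqlite3 module so it runs as-is. The column names (`parent_id`, `child_id`, `estimate_hours`), the sample rows, and the `descendants` helper are assumptions made for illustration, and "parent" is taken here to mean the task that must be finished first. The recursive query is one possible way to fetch all dependent tasks in a single database call; it relies on SQLite's `WITH RECURSIVE` support, and other databases may need different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (
    id             INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    estimate_hours REAL NOT NULL DEFAULT 0   -- hypothetical column
);
CREATE TABLE task_relationships (
    parent_id INTEGER NOT NULL REFERENCES tasks(id),  -- must be finished first
    child_id  INTEGER NOT NULL REFERENCES tasks(id),  -- depends on the parent
    PRIMARY KEY (parent_id, child_id),
    CHECK (parent_id <> child_id)                     -- no task blocks itself
);
""")

# Hypothetical sample data: "integrate" has two parents (a merge).
conn.executemany(
    "INSERT INTO tasks (id, name, estimate_hours) VALUES (?, ?, ?)",
    [(1, "design", 2), (2, "backend", 3), (3, "frontend", 1), (4, "integrate", 5)],
)
conn.executemany(
    "INSERT INTO task_relationships (parent_id, child_id) VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 4), (3, 4)],
)


def descendants(conn, task_id):
    """IDs of all tasks that depend on task_id, directly or indirectly."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(id) AS (
            SELECT child_id FROM task_relationships WHERE parent_id = ?
            UNION   -- UNION (not UNION ALL) drops duplicates from merged paths
            SELECT r.child_id
            FROM task_relationships AS r
            JOIN sub ON r.parent_id = sub.id
        )
        SELECT id FROM sub ORDER BY id
        """,
        (task_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(descendants(conn, 1))  # tasks blocked, directly or indirectly, by "design"
```

Swapping `parent_id` and `child_id` in the query walks the graph in the other direction, which gives the set of tasks a given task depends on.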