I'll try to explain the problem in mathematical terms.
Assume I have a set of items X = {x_1, x_2, ..., x_n}. Each item of X belongs to one of the sets S_1, S_2, ..., S_5. I consider all the subsets of X consisting of 5 items,
{x_i1, x_i2, ..., x_i5}, such that x_i1 belongs to S_1, ..., x_i5 belongs to S_5.
Some subsets are considered correct and some are not. A subset is correct if it does not contain conflicting items. I have a function f1 to determine whether a pair of items conflicts.
I also have a function f2 which can compare such correct subsets and say which subset is better (they might be equal as well).
I need to find the best non-conflicting subset(s).
Algo I used:
I built all the subsets and discarded the incorrect ones. Then I sorted the correct subsets using f2 as the comparison function (with quicksort) and took the best subset(s) from the front. Since there was a huge number of subsets, this procedure took an unacceptable amount of time.
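For reference, the brute-force procedure described above can be sketched roughly as follows. This is only an illustration under assumptions: the function name `best_subsets_brute_force` is made up, `f1(a, b)` is assumed to return True when the pair is compatible (no conflict), and `f2(s, t)` is assumed to behave like a comparator that returns a negative number when `s` is better, 0 when they are equal, and a positive number when `t` is better.

```python
from functools import cmp_to_key
from itertools import combinations, product


def best_subsets_brute_force(groups, f1, f2):
    """Enumerate one item per group, keep conflict-free picks, return the best.

    groups: list of lists, one list per set S_1..S_k (assumed layout).
    f1(a, b): assumed to return True when a and b do NOT conflict.
    f2(s, t): assumed comparator, negative if s is better, 0 if equal.
    """
    # Build every pick of one item per group and drop the conflicting ones.
    correct = [
        combo
        for combo in product(*groups)
        if all(f1(a, b) for a, b in combinations(combo, 2))
    ]
    if not correct:
        return []
    # Sort with f2 as the comparison function, best first.
    correct.sort(key=cmp_to_key(f2))
    best = correct[0]
    # Keep every subset that ties with the best one.
    return [c for c in correct if f2(c, best) == 0]
```

The number of candidate picks is the product of the group sizes, which is why this blows up so quickly.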
Is there a better approach in terms of running time?
UPDATED
Let's think of each x_i as an interval with integer endpoints. f1 returns true if two intervals do not intersect and false otherwise. f2 compares the total length of the intervals in each subset.
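To make the update concrete, here is one possible shape for f1 and f2 on intervals. It is a sketch, not my actual code: intervals are assumed to be `(start, end)` tuples, they are assumed to be closed (so two intervals that share an endpoint count as intersecting), and `total_length` is a helper name introduced just for this example.

```python
def f1(a, b):
    """Return True if intervals a and b do not intersect.

    Assumption: intervals are closed (start, end) tuples, so sharing
    an endpoint counts as an intersection.
    """
    return a[1] < b[0] or b[1] < a[0]


def total_length(subset):
    """Sum of the lengths of all intervals in a subset (hypothetical helper)."""
    return sum(end - start for start, end in subset)


def f2(s, t):
    """Comparator: negative if s is better (longer in total), 0 if equal."""
    return total_length(t) - total_length(s)
```

If the intervals are meant to be half-open, the strict `<` in f1 would become `<=`.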
This problem is a variation of maximum weighted interval scheduling. The DP algorithm has polynomial complexity O(N*log(N)) with O(N) space for the basic problem, and O(2^G * N * log(N)) complexity with O(2^G * N) space for this variation, where G and N are the total number of groups/subsets (5 here) and intervals, respectively.

If x_i doesn't represent intervals, then the problem is NP-hard in general, as other answers have shown.
First let me explain the dynamic programming solution for maximum weighted interval scheduling, and then solve the variation problem.
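- Let start(i), end(i), weight(i) be the starting point, ending point, and length of interval i, respectively.
- Number the intervals 1, 2, ... N in order of increasing starting point.
- Let next(i) represent the next interval that doesn't overlap with interval i.
- Define S(i) to be the maximum total weight obtainable considering only jobs i, i+1, ... N.
- S(1) is the solution: it considers all jobs 1, 2, ... N and returns the maximum total weight.
- Define S(i) recursively: either skip interval i, or take it and continue from next(i). That is, S(i) = max(S(i+1), weight(i) + S(next(i))), with S(N+1) = 0.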
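The steps above could look something like this in code. This is a minimal sketch under a few assumptions: intervals are `(start, end)` tuples with `start <= end`, the weight is the length `end - start`, intervals that share an endpoint are treated as overlapping, and indices are 0-based instead of the 1-based numbering used in the text. The function name is made up for the example.

```python
from bisect import bisect_right


def max_weight_schedule(intervals):
    """Maximum total length of pairwise non-overlapping intervals.

    intervals: list of (start, end) tuples, weight(i) = end - start.
    """
    ivs = sorted(intervals)  # order by starting point
    starts = [s for s, _ in ivs]
    n = len(ivs)
    # nxt[i] is next(i): index of the first interval that starts strictly
    # after interval i ends, found by binary search (log N per interval).
    nxt = [bisect_right(starts, e) for _, e in ivs]

    S = [0] * (n + 1)  # S[n] = 0 is the base case (no jobs left)
    for i in range(n - 1, -1, -1):
        s, e = ivs[i]
        # Either skip interval i, or take it and jump to next(i).
        S[i] = max(S[i + 1], (e - s) + S[nxt[i]])
    return S[0]
```

Filling the table from the back avoids recursion, so large N does not run into recursion-depth limits.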
The complexity of this solution is O(N*log(N) + N): N*log(N) for finding next(i) for all jobs, and N for solving the subproblems. Space is O(N) for saving subproblem solutions.

Now, let's solve the variation of this problem.
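- Let start(i), end(i), weight(i), subset(i) be the starting point, ending point, length, and subset of interval i, respectively.
- Number the intervals 1, 2, ... N in order of increasing starting point.
- Let next(i) represent the next interval that doesn't overlap with interval i.
- Define S(i, pending) to be the maximum total weight obtainable considering only jobs i, i+1, ... N, where pending is the list of subsets from which we still have to choose one interval each.
- S(1, {S_1,...S_5}) is the solution: it considers all jobs 1,...N, chooses one interval for each of S_1,...S_5, and returns the maximum total weight.
- Define S(i, pending) recursively as follows: S(i, pending) = 0 if pending is empty; S(i, pending) = -infinity if i > N and pending is not empty; otherwise S(i, pending) = max(S(i+1, pending), weight(i) + S(next(i), pending without subset(i))), where the second option is allowed only if subset(i) is in pending.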
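A rough sketch of this variation, with pending stored as a bitmask, might look like the following. The assumptions are the same as before (closed intervals, weight equal to length, 0-based indices), plus: each interval is a `(start, end, group)` tuple with `group` in `range(num_groups)`, and the function name and the `None` return value for the infeasible case are choices made for this example only.

```python
from bisect import bisect_right


def best_one_per_group(intervals, num_groups):
    """Max total length choosing exactly one interval per group, no overlaps.

    intervals: list of (start, end, group), group in range(num_groups).
    Returns None when no conflict-free choice exists.
    """
    ivs = sorted(intervals)  # order by starting point
    starts = [s for s, _, _ in ivs]
    n = len(ivs)
    nxt = [bisect_right(starts, e) for _, e, _ in ivs]  # next(i)

    NEG = float("-inf")
    full = (1 << num_groups) - 1  # every group is still pending
    # S[i][mask]: best weight using intervals i.. with groups in mask pending.
    S = [[NEG] * (full + 1) for _ in range(n + 1)]
    for row in S:
        row[0] = 0  # nothing pending: done, no more weight to add

    for i in range(n - 1, -1, -1):
        s, e, g = ivs[i]
        bit = 1 << g
        skip, take = S[i + 1], S[nxt[i]]
        for mask in range(1, full + 1):
            best = skip[mask]  # option 1: do not use interval i
            if mask & bit:     # option 2: use it for its pending group
                best = max(best, (e - s) + take[mask ^ bit])
            S[i][mask] = best

    result = S[0][full]
    return None if result == NEG else result
```

This returns only the best total weight. Recovering the chosen intervals (or all ties) would need an extra backtracking pass over the table.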
Note that I may have missed some base cases.
The complexity of this algorithm is O(2^G * N * log(N)) with O(2^G * N) space, where 2^G * N is the number of subproblems.

As an estimate, for small values of G <= 10 and high values of N >= 100000, this algorithm runs pretty quickly. For medium values of G >= 20, N should be low as well (N <= 10000) for it to finish in reasonable time. And for high values of G >= 40, it doesn't finish in any practical amount of time.
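To get a feel for these estimates, the number of subproblems 2^G * N can be computed directly. The helper name `dp_states` is made up for this illustration, and the last case uses N = 1 only to show the lower bound contributed by G = 40 alone.

```python
def dp_states(num_groups, num_intervals):
    """Number of (i, pending) subproblems: 2^G * N."""
    return (2 ** num_groups) * num_intervals


# The (G, N) pairs mirror the rough ranges mentioned above.
for g, n in [(10, 100_000), (20, 10_000), (40, 1)]:
    print(f"G={g}, N={n}: about {dp_states(g, n):.1e} subproblems")
```

Roughly 10^8 subproblems is manageable, 10^10 is already slow and memory-hungry, and 2^40 is over 10^12 before N is even taken into account.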