I’m looking for a way to generate combinations of objects ordered by a single attribute. I don’t think lexicographical order is what I’m looking for… I’ll try to give an example. Let’s say I have a list of objects A,B,C,D with the attribute values I want to order by being 3,3,2,1. This gives A3, B3, C2, D1 objects. Now I want to generate combinations of 2 objects, but they need to be ordered in a descending way:
- A3 B3
- A3 C2
- B3 C2
- A3 D1
- B3 D1
- C2 D1
Generating all combinations and sorting them is not acceptable, because the real-world scenario involves large sets and millions of combinations (a set of 40, combinations of 8), and I need only the combinations above a certain threshold.
Actually, I need the count of combinations above a threshold, grouped by the sum of a given attribute, but I think that is far more difficult to do, so I’d settle for generating all combinations above a threshold and counting them, if that’s possible at all.
EDIT: My original question wasn’t very precise… I don’t actually need these combinations ordered; I just thought ordering would help to isolate the combinations above a threshold. To be more precise: in the above example, given a threshold of 5, I’m looking for the information that the given set produces 1 combination with a sum of 6 (A3 B3) and 2 with a sum of 5 (A3 C2, B3 C2). I don’t actually need the combinations themselves.
I was looking into the subset-sum problem, but if I understood correctly, the dynamic programming solution given for it only tells you whether a given sum can be reached, not how many combinations reach it.
Thanks
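On the subset-sum point, the usual dynamic-programming table can hold counts instead of yes/no flags. The following is only a minimal Python sketch of that idea, not something taken from the question or the answer; the function name `count_by_sum` and its parameters are hypothetical. It treats objects with equal values (such as A3 and B3) as distinct, which matches the example.

```python
def count_by_sum(values, n, limit):
    """Return {sum: number of n-element combinations with that sum}, for sums >= limit."""
    # ways[k] maps a total s to the number of ways to pick k of the
    # values processed so far whose sum is s.
    ways = [dict() for _ in range(n + 1)]
    ways[0][0] = 1  # one way to pick nothing
    for v in values:
        # Walk k downwards so that each value is used at most once.
        for k in range(n - 1, -1, -1):
            for s, c in ways[k].items():
                ways[k + 1][s + v] = ways[k + 1].get(s + v, 0) + c
    return {s: c for s, c in ways[n].items() if s >= limit}


# The question's data: values 3, 3, 2, 1, pairs, threshold 5.
print(count_by_sum([3, 3, 2, 1], 2, 5))
```

The work grows with the number of values, the combination size, and the number of distinct sums, rather than with the number of combinations.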
Actually, I think you do want lexicographic order, but descending rather than ascending. In addition:
I’ll post sample code later.

Here’s the sample code I promised, with a few remarks following:
Preface remarks:
This uses a little helper class called Tally, that just isolates the tabulation (including initialization for never-before-seen keys). I’ll put it at the end.
To keep this concise, I’ve taken some shortcuts that aren’t good practice for ‘real’ code:
`count`. That makes this class non-thread-safe.

Explanation:
An instance of `Combos` is created with the (descending ordered) array of integers to combine. The `values` array is set up once per instance, but multiple calls to `count` can be made with varying population sizes and limits.

The `count` method triggers a (mostly) standard recursive traversal of unique combinations of `n` integers from `values`. The `limit` argument gives the lower bound on sums of interest.

The `countAt` method examines combinations of integers from `values`. The `left` argument is how many integers remain to make up `n` integers in a sum, `start` is the position in `values` from which to search, and `sum` is the partial sum.

The early-bail-out mechanism is based on computing `best`, a two-dimensional array that specifies the ‘best’ sum reachable from a given state. The value in `best[n][p]` is the largest sum of `n` values beginning in position `p` of the original `values`.

The recursion of `countAt` bottoms out when the correct population has been accumulated; this adds the current `sum` (of `n` values) to the `tally`. If `countAt` has not bottomed out, it sweeps the `values` from the `start`-ing position to increase the current partial `sum`, as long as:
- enough positions remain in `values` to achieve the specified population, and
- the `best` (largest) subtotal remaining is big enough to make the `limit`.

A sample run with your question’s data:
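The original listing is not reproduced in this copy. As a rough stand-in, the traversal described above might look like the following Python sketch. The names `Combos`, `count`, `best` and `limit` follow the explanation; `_build_best`, `_count_at` and the use of `collections.Counter` in place of `Tally` are assumptions of this sketch, and the input is assumed to be sorted in descending order. The last lines run it on the question’s data (values 3, 3, 2, 1, pairs, limit 5).

```python
from collections import Counter


class Combos:
    """Counts n-element combinations by sum, pruning branches that cannot reach the limit."""

    def __init__(self, values):
        # Assumption: values are already sorted in descending order.
        self.values = list(values)

    def _build_best(self, n):
        # best[k][p] = largest sum of k values taken from position p onward.
        # With descending input that is just the k values starting at p.
        size = len(self.values)
        best = [[0] * (size + 1) for _ in range(n + 1)]
        for k in range(1, n + 1):
            for p in range(size - k, -1, -1):
                best[k][p] = self.values[p] + best[k - 1][p + 1]
        return best

    def count(self, n, limit):
        tally = Counter()  # stands in for the Tally helper
        self._count_at(n, 0, 0, limit, self._build_best(n), tally)
        return tally

    def _count_at(self, left, start, total, limit, best, tally):
        if left == 0:
            # The pruning below guarantees total >= limit whenever n >= 1.
            tally[total] += 1
            return
        # Stop while `left` values still fit at or after position i.
        for i in range(start, len(self.values) - left + 1):
            # Descending order: if the best completion from i falls short,
            # every later starting position falls short as well.
            if total + best[left][i] < limit:
                break
            self._count_at(left - 1, i + 1, total + self.values[i], limit, best, tally)


result = Combos([3, 3, 2, 1]).count(2, 5)
for total in sorted(result, reverse=True):
    print(total, result[total])  # prints "6 1", then "5 2"
```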
produces the results you specified:
Here’s the Tally code:
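The listing itself did not survive in this copy. Going only by the description given earlier (a helper that isolates the tabulation, including initialization for never-before-seen keys), a minimal Python sketch could look like this. The method names `inc`, `get` and `items` are hypothetical, not the original API.

```python
class Tally:
    """Counts occurrences per key; unseen keys start at zero."""

    def __init__(self):
        self._counts = {}

    def inc(self, key, amount=1):
        # Treat a never-before-seen key as 0 before adding.
        self._counts[key] = self._counts.get(key, 0) + amount

    def get(self, key):
        return self._counts.get(key, 0)

    def items(self):
        # Largest keys first, convenient for listing sums in descending order.
        return sorted(self._counts.items(), reverse=True)


t = Tally()
for total in (6, 5, 5):
    t.inc(total)
print(t.items())  # [(6, 1), (5, 2)]
```

In Python, `collections.Counter` already provides this behaviour, so a separate class is mostly useful for keeping the counting code readable.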