UPDATE: I have realized the problem below is not possible to answer in its current form because of the large amount of data involved (15k+ items). I just found out, the group I’m trying to help just lets it run for a month then terminates it to use the results (which is why they wanted to get more results in a quicker time). This seems insane to me because they are only using the first few sets of data(the last items in the large lists never get used). So I’m revising this question to get a sample of the intended output (approximation of solutions not full solution). What’s the best way to complete this in a smaller amount of time? They seem to want a diverse sample of results, is it genetic algorithms work or some kind of sampling technique? The rest of the question remains the same(same inputs/outputs) but I’m not looking for the complete solution set now(as it will never complete in a lifetime but I hope a parcial list of diverse solutions can).
My problem is not exactly a knapsack problem but its pretty close. Basically, I’m trying to find every combination of X items that equal a specific value. I heard of this problem from a friend of mine who worked in a small school research lab where this process was ran and took about 25 days to complete. Seemed really horrible so I offered to help (benefit to me, is I get to learn and help some really nice people), so I figured out how to scale it by multi-threading it (I’ll include the code below), this cut a few days off their processing time but I still wasn’t satisfied so I just ported my code to work on a GPU but I don’t feel satisfied (although they are happy because its faster and I donated my old video card) because I’m just leveraging hardware and not really any algorithms. Right now, I just brute force the results by checking if the value equals the total and if it does then save the result if it doesn’t then keep processing it.
So with that background, is there anything I can do to speed it up algorithmically? My gut tells me no because since they need every combination it seems logically that the computer has to process every combination (which is several billion) but I’ve seen amazing things here before and even a small speedup can make a difference in days of processing.
I have like over 10 versions of the code but here’s a Java version that uses multi-threading(but the logic between this and gpu is pretty much the same).
Basic logic:
for (int c = 100; c >= 0; c--) {
if (c * x_k == current.sum) { //if result is correct then save
solutions.add(new Context(0, 0, newcoeff));
continue;
} else if (current.k > 0) { //if result is not equal but not end of list then send to queue
contexts.add(new Context(current.k - 1, current.sum - c * x_k, newcoeff));
}
}
Full code:
import java.util.Arrays;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
public class MixedParallel
{
// pre-requisite: sorted values !!
private static final int[] data = new int[] { -5,10,20,30,35 };
// Context to store intermediate computation or a solution
static class Context {
int k;
int sum;
int[] coeff;
Context(int k, int sum, int[] coeff) {
this.k = k;
this.sum = sum;
this.coeff = coeff;
}
}
// Thread pool for parallel execution
private static ExecutorService executor;
// Queue to collect solutions
private static Queue<Context> solutions;
static {
final int numberOfThreads = 2;
executor =
new ThreadPoolExecutor(numberOfThreads, numberOfThreads, 1000, TimeUnit.SECONDS,
new LinkedBlockingDeque<Runnable>());
// concurrent because of multi-threaded insertions
solutions = new ConcurrentLinkedQueue<Context>();
}
public static void main(String[] args)
{
System.out.println("starting..");
int target_sum = 100;
// result vector, init to 0
int[] coeff = new int[data.length];
Arrays.fill(coeff, 0);
mixedPartialSum(data.length - 1, target_sum, coeff);
executor.shutdown();
// System.out.println("Over. Dumping results");
while(!solutions.isEmpty()) {
Context s = solutions.poll();
printResult(s.coeff);
}
}
private static void printResult(int[] coeff) {
StringBuffer sb = new StringBuffer();
for (int i = coeff.length - 1; i >= 0; i--) {
if (coeff[i] > 0) {
sb.append(data[i]).append(" * ").append(coeff[i]).append(" + ");
}
}
System.out.println(sb);
}
private static void mixedPartialSum(int k, int sum, int[] coeff) {
int x_k = data[k];
for (int c = 0; c <= 100; c++) {
coeff[k] = c;
int[] newcoeff = Arrays.copyOf(coeff, coeff.length);
if (c * x_k == sum) {
//printResult(newcoeff);
solutions.add(new Context(0, 0, newcoeff));
continue;
} else if (k > 0) {
if (data.length - k < 2) {
mixedPartialSum(k - 1, sum - c * x_k, newcoeff);
// for loop on "c" goes on with previous coeff content
} else {
// no longer recursive. delegate to thread pool
executor.submit(new ComputePartialSum(new Context(k - 1, sum - c * x_k, newcoeff)));
}
}
}
}
static class ComputePartialSum implements Callable<Void> {
// queue with contexts to process
private Queue<Context> contexts;
ComputePartialSum(Context request) {
contexts = new ArrayDeque<Context>();
contexts.add(request);
}
public Void call() {
while(!contexts.isEmpty()) {
Context current = contexts.poll();
int x_k = data[current.k];
for (int c = 0; c <= 100; c++) {
current.coeff[current.k] = c;
int[] newcoeff = Arrays.copyOf(current.coeff, current.coeff.length);
if (c * x_k == current.sum) {
//printResult(newcoeff);
solutions.add(new Context(0, 0, newcoeff));
continue;
} else if (current.k > 0) {
contexts.add(new Context(current.k - 1, current.sum - c * x_k, newcoeff));
}
}
}
return null;
}
}
}
Here are some of the characteristics of the data/approach:
- All numbers are shorts (no number seems to exceed a value of +/- 200)
- There are duplicates (but no zero values)
- The for loop limits the coefficients to 100 (this is a hard number and
told it will not change). This bounds the results - There is a limit of number of items but its variable and decided by
my friends lab. I have been testing with 2 pairs combinations but my
friend told me they use 30-35 pairs (it’s not combinations that involve
the entire dataset). This also bounds the results from being out of control - My friend mentioned that the post processing they do involves deleting all results that contain less than 30 coefficients or exceed 35. In my current code I break if the
newcoeffvariable exceeds a number (in this case 35) but maybe there’s a way to not even process results that are below 30. This might be a big area to reduce processing time. as now it seems they generate a lot of useless data to get to the ones they want. - Their dataset is 10k-15k of items (negative/positive)
- I receive only 3 items, two lists (one data and one id numbers to
identify the data) and a target sum. I then save a file with all the
combinations of data in that list. - I offered to help here because this part took the longest time,
before the data comes to me they do something to it(although they do
not generate the data themselves) and once I send them the file they
apply their own logic to it and process it. Thus, my only focus is
taking the 3 inputs and generating the output file. - Using threading and GPU has reduced the problem to complete within a
week but what I’m looking for here is ideas to improve the algorithm
so I can leverage the software instead of just hardware gpu’s to
increase speed. As you can see from the code, its just brute force
right now. So ideally I would like suggestions that are thread-able.
Update2: I think the problem itself is pretty easy/common but the issue is running it at scale so here’s the real data I got when we did a test (it’s not as large as it gets but its about 3,000 items so if you want to test you don’t have to generate it yourself):
private static final int target_sum = 5 * 1000;
private static final List<Integer> data = Arrays.asList( -193, -138, -92, -80, -77, -70, -63, -61, -60, -56, -56, -55, -54, -54, -51, -50, -50, -50, -49, -49, -48, -46, -45, -44, -43, -43, -42, -42, -42, -42, -41, -41, -40, -40, -39, -38, -38, -38, -37, -37, -37, -37, -37, -36, -36, -36, -35, -34, -34, -34, -34, -34, -34, -34, -33, -33, -33, -32, -32, -32, -32, -32, -32, -32, -32, -31, -31, -31, -31, -31, -31, -31, -30, -30, -30, -30, -30, -29, -29, -29, -29, -29, -29, -29, -29, -29, -28, -28, -28, -28, -27, -27, -27, -27, -26, -26, -26, -26, -26, -26, -25, -25, -25, -25, -25, -25, -25, -25, -24, -24, -24, -24, -24, -24, -24, -24, -24, -24, -23, -23, -23, -23, -23, -23, -23, -23, -22, -22, -22, -22, -22, -22, -22, -22, -22, -21, -21, -21, -21, -21, -21, -21, -20, -20, -20, -20, -20, -20, -20, -19, -19, -19, -19, -19, -19, -19, -19, -19, -19, -19, -19, -19, -19, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -16, -16, -16, -16, -16, -16, -16, -16, -16, -16, -16, -16, -16, -16, -16, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 26, 26, 26, 26, 26, 26, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30, 30, 31, 31, 31, 31, 31, 32, 32, 32, 32, 33, 33, 33, 33, 34, 34, 34, 34, 34, 34, 34, 35, 35, 35, 35, 36, 36, 36, 36, 37, 37, 38, 39, 39, 39, 40, 41, 41, 41, 41, 41, 42, 42, 43, 43, 44, 45, 45, 46, 47, 47, 48, 48, 49, 49, 50, 54, 54, 54, 55, 55, 56, 56, 57, 57, 57, 57, 57, 58, 58, 58, 59, 60, 66, 67, 68, 70, 72, 73, 73, 84, 84, 86, 92, 98, 99, 105, 114, 118, 120, 121, 125, 156);
I’m learning programming and algorithms so if I missed anything or something does not make sense please let me know. please note I understand that this may seem impossible because of the large data but it’s not (I have seen it run and a variation of it has been running for years). Please don’t let the scale distract from the real problem, if you use 10 variables and its 10% faster then my 10 variable brute force then I’m sure it’ll be faster on larger data. I’m not looking to blow the lights out, I’m looking for small improvements (or larger design improvements) that provide even slightly faster results. Also if there’s any assumptions that need to relaxed let me know.
This uses dynamic programming to solve the same problem you gave in the example. It’s been updated to deal with duplicate values by keeping track of the value’s index rather than its value, and to correct a bug which omitted some solutions.