Note: this is an abstract rewording of a real-life problem regarding ordering records in a SWF file. A solution will help me improve an open-source application.
Bob has a store, and wants to do a sale. His store carries a number of products, and he has a certain integer quantity of units of each product in stock. He also has a number of shelf-mounted price labels (as many as the number of products), with the prices already printed on them. He can place any price label on any product (the label sets the unit price for his entire stock of that product); however, some products have an additional restriction – any such product may not be cheaper than a certain other product.
You must find how to arrange the price labels, such that the total cost of all of Bob’s wares is as low as possible. The total cost is the sum of each product’s assigned price label multiplied by the quantity of that product in stock.
Given:
- N – the number of products and price labels
- S_i, 0≤i<N – the quantity in stock of the product with index i (integer)
- P_j, 0≤j<N – the price on the price label with index j (integer)
- K – the number of additional constraint pairs
- A_k, B_k, 0≤k<K – product indices for the additional constraints: product B_k must not be cheaper than product A_k
- Any product index may appear at most once in B. Thus, the graph formed by this adjacency list is actually a set of directed trees.
The program must find:
- M_i, 0≤i<N – mapping from product index to price label index (P_{M_i} is the price of product i)
To satisfy the conditions:
- P_{M_{A_k}} ≤ P_{M_{B_k}}, for 0≤k<K
- Σ(S_i × P_{M_i}) for 0≤i<N is minimal
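To make the two conditions concrete, here is a minimal sketch of how a candidate mapping could be checked and scored. The function names `total_cost` and `is_valid` are made up for illustration, and each constraint is assumed to be a pair `(a, b)` meaning that product `b` must not be cheaper than product `a`.

```python
def total_cost(stock, prices, mapping):
    # Sum of S_i * P_{M_i} over all products i.
    return sum(s * prices[j] for s, j in zip(stock, mapping))


def is_valid(prices, constraints, mapping):
    # Every label must be used exactly once.
    if sorted(mapping) != list(range(len(prices))):
        return False
    # For each pair (a, b): product b must not be cheaper than product a.
    return all(prices[mapping[a]] <= prices[mapping[b]] for a, b in constraints)


# Hypothetical two-product instance: product 1 must not be cheaper than product 0.
stock = [3, 1]
prices = [5, 2]
constraints = [(0, 1)]
print(is_valid(prices, constraints, [1, 0]), total_cost(stock, prices, [1, 0]))
print(is_valid(prices, constraints, [0, 1]))
```

In this toy instance only the first mapping is allowed, because the second one would make product 1 cheaper than product 0.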
Note that if not for the first condition, the solution would be simply sorting labels by price and products by quantity, and matching both directly.
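The unconstrained matching mentioned in this note can be sketched as follows: the product with the largest quantity gets the cheapest label, and so on. The helper names are hypothetical, and the sketch deliberately ignores the additional constraints.

```python
def match_without_constraints(stock, prices):
    # Product indices, largest quantity first.
    by_stock = sorted(range(len(stock)), key=lambda i: -stock[i])
    # Label indices, cheapest first.
    by_price = sorted(range(len(prices)), key=lambda j: prices[j])
    mapping = [0] * len(stock)
    for i, j in zip(by_stock, by_price):
        mapping[i] = j  # i-th largest stock gets the i-th cheapest label
    return mapping


def total_cost(stock, prices, mapping):
    return sum(s * prices[j] for s, j in zip(stock, mapping))


stock = list(range(1, 11))   # quantities 1 through 10
prices = list(range(1, 11))  # prices $1 through $10
m = match_without_constraints(stock, prices)
print(total_cost(stock, prices, m))  # 220
```

Any constraint that this matching happens to violate pushes the optimum above this lower bound.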
Typical input sizes are N, K < 10000. In the real-life problem, there are only a few distinct prices (1, 2, 3, 4).
Here’s one example of why most simple solutions (including topological sort) won’t work:
You have 10 items with the quantities 1 through 10, and 10 price labels with the prices $1 through $10. There is one condition: the item with the quantity 10 must not be cheaper than the item with the quantity 1.
The optimal solution is:
Price, $   1   2   3   4   5   6   7   8   9  10
Qty        9   8   7   6   1  10   5   4   3   2
with a total cost of $249. If you place the 1,10 pair near either extreme, the total cost will be higher.
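The $249 figure can be double-checked with a small sketch. It relies on one observation: once the prices of the two constrained items are fixed, the remaining items are unconstrained and can be matched by sorting. The function name is hypothetical, and the code only covers this specific single-constraint example.

```python
from itertools import combinations


def best_with_pair_constraint(n=10):
    prices = list(range(1, n + 1))
    best = None
    # p_low goes to the item with quantity 1, p_high to the item with quantity n,
    # so the quantity-n item is never cheaper than the quantity-1 item.
    for p_low, p_high in combinations(prices, 2):
        rest_prices = [p for p in prices if p not in (p_low, p_high)]  # ascending
        rest_qty = range(n - 1, 1, -1)  # quantities n-1 .. 2, largest first
        cost = 1 * p_low + n * p_high
        cost += sum(q * p for q, p in zip(rest_qty, rest_prices))
        if best is None or cost < best[0]:
            best = (cost, p_low, p_high)
    return best


print(best_with_pair_constraint())  # (249, 5, 6)
```

The minimum is reached with the pair sitting at $5 and $6, in the middle of the range rather than at either end.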
The problem is NP-complete for the general case. This can be shown via a reduction from 3-partition (which is a strongly NP-complete variant of bin packing).
Let w_1, …, w_n be the weights of the objects of the 3-partition instance, let b be the bin size, and let k = n/3 be the number of bins to be filled. A 3-partition exists if the objects can be partitioned so that every bin holds exactly 3 objects with total weight exactly b.
For the reduction, we set N = kb, and each bin is represented by b price labels of the same price (think of P_i increasing every b-th label). Let t_i, 1≤i≤k, be the price of the labels corresponding to the i-th bin.
For each w_i we have one product S_j of quantity w_i + 1 (let’s call this the root product of w_i) and another w_i – 1 products of quantity 1 which are required to be no more expensive than S_j (call these the leaf products).
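For t_i = (2b + 1)^i, 1≤i≤k, there is a 3-partition if and only if Bob can sell for 2b Σ_{1≤i≤k} t_i:
If a 3-partition exists, the b products belonging to the three objects of a bin can all take that bin’s price without violating any constraint. Thus, the solution has cost 2b Σ_{1≤i≤k} t_i (since the total quantity of products with price t_i is 2b).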
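The construction so far can be sketched in code. This is only an illustration under stated assumptions: the function name `build_instance` is made up, the prices are read as t_i = (2b + 1)^i (exponent form), and a constraint pair `(a, b)` means that product `a` must not cost more than product `b`.

```python
def build_instance(weights, b):
    k = len(weights) // 3
    assert len(weights) == 3 * k and sum(weights) == k * b
    # b labels per bin, all labels of bin i priced t_i = (2b + 1)^i.
    prices = [(2 * b + 1) ** i for i in range(1, k + 1) for _ in range(b)]
    stock, constraints = [], []
    for w in weights:
        root = len(stock)
        stock.append(w + 1)  # root product of this weight
        for _ in range(w - 1):
            # Dependent product of quantity 1: must not cost more than its root.
            constraints.append((len(stock), root))
            stock.append(1)
    # Cost Bob reaches exactly when a 3-partition exists.
    target = 2 * b * sum((2 * b + 1) ** i for i in range(1, k + 1))
    return stock, prices, constraints, target


stock, prices, constraints, target = build_instance([1, 2, 3, 2, 2, 2], b=6)
print(len(stock), sum(stock), target)  # 12 24 2184
```

Each weight w contributes w products with a total quantity of 2w, so a bin whose three weights sum to b uses exactly b labels and a quantity of 2b.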
First observe that in any solution where more than 3 root products share the same price label, for each such root product that is “too many” there is a cheaper price tag that sticks on fewer than 3 root products. This is worse than any solution where there are exactly 3 root products per price label (if one exists).
Now there can still be a solution of Bob’s Sale with 3 root products per price, but whose leaf products do not carry the same price labels as their roots (the bins sort of overflow).
Say the most expensive price label tags a root product of w_i which has a leaf product tagged with a cheaper price. This implies that the weights w_i, w_j, w_l of the 3 root products tagged with the most expensive price add up to more than b. Hence, the total quantity of products tagged with this price is at least 2b+1.
Hence, such a solution has cost at least t_k(2b+1) plus the cost of the remaining assignment. Since the optimal cost when a 3-partition exists is 2b Σ_{1≤i≤k} t_i, we have to show that the case just considered is worse. This is the case if t_k > 2b Σ_{1≤i≤k-1} t_i (note that the sum now runs only to k-1). Setting t_i = (2b + 1)^i, 1≤i≤k, this is the case. The same argument applies if the “bad” price tag is not the most expensive one but any other.
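For completeness, the inequality can be checked with a geometric sum, reading the prices as t_i = (2b + 1)^i (the exponent form is what makes it hold):

```latex
\begin{aligned}
2b \sum_{i=1}^{k-1} t_i
  &= 2b \sum_{i=1}^{k-1} (2b+1)^i
   = 2b \cdot \frac{(2b+1)^k - (2b+1)}{(2b+1) - 1} \\
  &= (2b+1)^k - (2b+1)
   \;<\; (2b+1)^k = t_k .
\end{aligned}
```

So one extra unit of quantity at price t_k already costs more than everything that could be saved across all cheaper prices.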
So, this is the destructive part 😉 However, if the number of different price tags is a constant, you can use dynamic programming to solve it in polynomial time.
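One possible shape for such a dynamic program is sketched below. It is not necessarily the formulation meant above: it assumes the constraints form a forest in which a pair `(a, b)` makes `a` the parent of `b` (so `b` must not be cheaper than `a`), and it tracks how many labels of each distinct price a subtree uses. All names are hypothetical, and the recursion is not tuned for deep trees.

```python
from functools import lru_cache


def min_total_cost(stock, prices, constraints):
    distinct = sorted(set(prices))
    limit = tuple(prices.count(p) for p in distinct)  # labels available per price
    d = len(distinct)
    children = [[] for _ in stock]
    has_parent = [False] * len(stock)
    for a, b in constraints:
        children[a].append(b)
        has_parent[b] = True

    def merge(x, y):
        # Combine two tables {label-count vector: min cost}, respecting the limits.
        out = {}
        for u, cu in x.items():
            for w, cw in y.items():
                s = tuple(i + j for i, j in zip(u, w))
                if all(i <= m for i, m in zip(s, limit)):
                    if cu + cw < out.get(s, float("inf")):
                        out[s] = cu + cw
        return out

    @lru_cache(maxsize=None)
    def subtree(v, lo):
        # Best costs for the subtree of v when v's price index is at least lo.
        best = {}
        for p in range(lo, d):
            unit = tuple(int(i == p) for i in range(d))
            table = {unit: stock[v] * distinct[p]}
            for c in children[v]:
                table = merge(table, subtree(c, p))  # children are never cheaper
            for vec, cost in table.items():
                if cost < best.get(vec, float("inf")):
                    best[vec] = cost
        return best

    total = {tuple([0] * d): 0}
    for r in range(len(stock)):
        if not has_parent[r]:
            total = merge(total, subtree(r, 0))
    return total.get(limit)  # None if no valid assignment exists


print(min_total_cost([5, 1, 3], [1, 2, 2], [(1, 0)]))  # 15
```

With d distinct prices there are at most (N+1)^d count vectors, so the tables stay polynomial in N when d is a constant, such as the four prices of the real-life problem.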