Suppose we have a DAG whose edges are labeled with numbers. Define the value of a path as the product of the labels along it. For each (source, sink) pair, I want to find the sum of the values of all paths from the source to the sink. This can be done in polynomial time with dynamic programming, but there are still choices to be made in how to decompose the problem. In my case, I have one DAG that has to be evaluated repeatedly with different labelings. My question is: for a given DAG, how can we precompute a good strategy for computing these values repeatedly for different labelings? It would be nice if there were an algorithm that finds an optimal strategy, for example one that minimizes the number of multiplications. But perhaps this is too much to ask; I would be very happy with an algorithm that just gives a good decomposition.
Let S be the set of sources, V the set of vertices of the DAG, E the set of edges, n = |V|, m = |S|, W an n x n matrix that stores the edge weights, and C an m x n matrix such that, at the end of the algorithm, C[i,j] holds the sum of the values of all paths from i to j.
To simplify the explanation and the correctness proof, I assume that the vertices are numbered 1 to n in topological order, with nodes 1 to m being the sources. Computing such a numbering adds O(|E|+|V|) to the running time of the algorithm.
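One way to obtain such a numbering is Kahn's algorithm with a FIFO queue: every vertex with in-degree 0 is enqueued before the main loop starts, so all sources come out first. The sketch below is only an illustration. It assumes that "source" means a vertex with no incoming edges, and the function name `topo_order_sources_first` and the edge-list input format are hypothetical choices, not part of the answer.

```python
from collections import deque

def topo_order_sources_first(vertices, edges):
    """Return (order, m): a topological order with the m sources first.

    vertices: iterable of hashable vertex names (assumed input format)
    edges:    iterable of (u, v) pairs meaning an edge u -> v
    """
    vertices = list(vertices)
    indeg = {v: 0 for v in vertices}
    succ = {v: [] for v in vertices}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    # All in-degree-0 vertices (the sources) enter the queue up front,
    # so a FIFO queue emits them before any other vertex.
    queue = deque(v for v in vertices if indeg[v] == 0)
    m = len(queue)

    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    if len(order) != len(vertices):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order, m
```

Renumbering the vertices by their position in `order` gives the numbering assumed above (0-based here rather than 1-based). The sort runs in O(|E|+|V|) time.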
Here is the pseudocode of the algorithm:
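A minimal Python sketch of the recurrence used in the correctness proof, C[i,k] = sum of C[i,k’]*W[k’,k] over all predecessors k’ of k. It is an illustration written for this explanation, not the answer's own listing, so line numbers mentioned elsewhere do not refer to it. Assumptions: vertices are numbered 0 to n-1 in topological order (0-based instead of 1-based), the sources are 0 to m-1, the weights are given as a dictionary keyed by edge, and C[i,i] is initialized to 1 for each source i to account for the empty path. The name `path_sums` is hypothetical.

```python
def path_sums(n, m, weights):
    """Sum of path values from each source to each vertex.

    n:       number of vertices, numbered 0..n-1 in topological order
    m:       number of sources, assumed to be vertices 0..m-1
    weights: dict mapping an edge (u, v) with u < v to its label
    Returns an m x n list of lists C with C[i][k] as in the text.
    """
    # Group incoming edges by their head so each vertex sees its predecessors.
    preds = [[] for _ in range(n)]
    for (u, v), w in weights.items():
        preds[v].append((u, w))

    C = [[0] * n for _ in range(m)]
    for i in range(m):
        C[i][i] = 1  # assumption: the empty path from i to i has value 1

    for k in range(n):                # vertices in topological order
        for kp, w in preds[k]:        # each predecessor k' of k
            for i in range(m):        # each source
                C[i][k] += C[i][kp] * w
    return C
```

For a diamond-shaped DAG with edges 0→1, 0→2, 1→3, 2→3 labeled 2, 3, 5, 7, the call `path_sums(4, 1, {(0, 1): 2, (0, 2): 3, (1, 3): 5, (2, 3): 7})` returns `[[1, 2, 3, 31]]`, since 2*5 + 3*7 = 31.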
The two outer loops perform O(|E|+|V|) iterations in total, and each iteration does O(m) work, one update per source. Therefore, the running time of the algorithm is O((|V|+|E|)·m), assuming that addition and multiplication take constant time. This includes the time for topological sorting.
Proof of correctness: we prove by induction that after the completion of the k-th iteration of the outermost loop, C[i,k] is the sum of values of all paths from i to k for each i in S.
Base case: the claim holds for k = 1, because the first vertex has no predecessors.
Induction: Assume C[i,j] is correctly computed for all j < k. Every nonempty path from a source i to k has to pass through a predecessor k’ of k. Since we iterate in topological order, k’ must be smaller than k, and by the induction hypothesis C[i,k’] is the sum of the values of all paths from i to k’. Moreover, the sum of the values of the paths from i to k that pass through a specific predecessor k’ is equal to the sum of the values of the paths from i to k’, i.e., C[i,k’], multiplied by W[k’,k]. Therefore, the sum of the values of all paths from i to k is the sum of C[i,k’]*W[k’,k] over all predecessors k’ of k.
Same graph structure, different W matrix: If we need to compute the matrix C for several graphs that have the same structure but different W, we can do the following. Let C’ be a matrix whose elements are lists of 3-tuples. Replace line 6 above with
Then, by iterating through the vertices in topological order and through the tuples in C'[i,k], you can compute C[i,k] without looking at the graph structure, because the tuples implicitly represent it. In terms of complexity, this is neither better nor worse.
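A rough Python sketch of this precompute-then-replay idea. The exact contents of the 3-tuples are not spelled out here, so the sketch assumes each tuple is (i, k’, k), meaning "add C[i,k’]*W[k’,k] to C[i,k]". The names `build_plan` and `evaluate`, the 0-based numbering, the dictionary format for W, and the initialization C[i,i] = 1 for sources are likewise assumptions made for illustration.

```python
def build_plan(n, m, edge_list):
    """Precompute C' once per graph structure.

    n, m:      number of vertices and sources; vertices 0..n-1 are in
               topological order and 0..m-1 are the sources (assumed)
    edge_list: iterable of (k_prime, k) pairs with k_prime < k
    Returns plan with plan[i][k] = list of (i, k_prime, k) tuples.
    """
    plan = [[[] for _ in range(n)] for _ in range(m)]
    for kp, k in edge_list:
        for i in range(m):
            plan[i][k].append((i, kp, k))
    return plan

def evaluate(plan, n, m, W):
    """Replay the plan for one labeling W (dict: (k_prime, k) -> label)."""
    C = [[0] * n for _ in range(m)]
    for i in range(m):
        C[i][i] = 1  # assumption: the empty path from i to i has value 1
    for k in range(n):              # topological order
        for i in range(m):
            for src, kp, kk in plan[i][k]:
                # C[src][kp] is final here because kp < kk in the ordering.
                C[src][kk] += C[src][kp] * W[(kp, kk)]
    return C
```

The plan is built once and reused across labelings, for example:

```python
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
plan = build_plan(4, 1, edges)
C1 = evaluate(plan, 4, 1, {(0, 1): 2, (0, 2): 3, (1, 3): 5, (2, 3): 7})
C2 = evaluate(plan, 4, 1, {e: 1 for e in edges})
```

As written, this performs the same multiplications as the direct algorithm for every labeling; it only moves the graph traversal out of the repeated part.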