I have a connected, undirected graph G = (V, E), a set S = {S_1, S_2, …, S_n} where each S_i is a subset of V, and an integer k > 1. How can I partition V into k subsets such that:
- for each i, every node in S_i is in the same subset
- each subset represents a connected subgraph of G?
The Steiner forest problem is: given a weighted graph G = (V, E) and pairs of vertices (s_1, t_1), …, (s_m, t_m), find a minimum-weight edge-subgraph H of G such that, for all i, vertices s_i and t_i belong to the same connected component of H.
The decision version of your problem is essentially the decision version of Steiner forest with unit weights. Unfortunately, this special case is still NP-hard.
The reduction from the special case of Steiner forest to your problem works as follows. Given an unweighted instance of Steiner forest and the question of whether a solution of cost at most c exists, create an instance of your problem with the same graph, k = |V| - c, and S_i = {s_i, t_i} for all i. If there exists a Steiner forest H of cost at most c, then the connected components of H (counting every vertex that H does not touch as a component of its own) are connected subsets that keep each pair together, and there are at least |V| - c = k of them. If there are more than k, merge two components joined by an edge of G until exactly k remain; this is always possible because G is connected, and merging preserves both properties. Conversely, if the instance of your problem has a solution, then we can find a spanning tree within each of your subsets, and the trees together use |V| - k = c edges.
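To make the forward direction concrete, here is a minimal Python sketch. It assumes the forest H is handed over as a list of edges; the function names `components_from_forest` and `merge_down_to_k` are made up for illustration. It reads the parts off the forest and then merges neighbouring parts until exactly k remain.

```python
def components_from_forest(vertices, forest_edges):
    """Group the vertices by connected component of the edge-subgraph H."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, w in forest_edges:
        parent[find(u)] = find(w)

    groups = {}
    for v in vertices:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


def merge_down_to_k(parts, graph_edges, k):
    """Merge parts joined by an edge of G until exactly k parts remain."""
    parts = [set(p) for p in parts]
    while len(parts) > k:
        owner = {v: i for i, p in enumerate(parts) for v in p}
        for u, w in graph_edges:
            i, j = owner[u], owner[w]
            if i != j:
                parts[i] |= parts[j]  # union of two adjacent connected parts
                del parts[j]
                break
        else:
            # No edge crosses between parts, so G cannot be connected.
            raise ValueError("G is not connected")
    return parts
```

For example, on the path 1-2-3-4-5 with pairs (1, 2) and (4, 5), the forest {(1, 2), (4, 5)} gives three parts, and `merge_down_to_k(parts, edges, 2)` joins two adjacent ones.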
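The best approximation ratio known for Steiner forest is about 2, and it applies to the number of edges c = |V| - k rather than to k itself, so it won’t help you much if k is small: a forest with up to twice as many edges may leave far fewer than k components. You’ll probably have to use branch and bound.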
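As a rough illustration of what such a search could look like, here is a naive backtracking sketch in Python with a single bound: the edge budget |V| - k. It is exponential and only meant for small instances. The name `partition_with_budget` and its input format (vertex list, edge list, list of sets S_i) are assumptions for the example. A real branch and bound would add stronger bounds, for instance pruning as soon as some S_i can no longer be connected with the remaining edges.

```python
def partition_with_budget(vertices, edges, groups, k):
    """Look for a forest H with at most |V| - k edges that connects every group.

    Returns the components of H (at least k of them), or None if none exists.
    """
    budget = len(vertices) - k
    if budget < 0:
        return None  # more parts requested than there are vertices
    edges = list(edges)

    def find(parent, v):
        while parent[v] != v:
            v = parent[v]
        return v

    def satisfied(parent):
        # Every group must lie inside a single component of H.
        return all(len({find(parent, v) for v in g}) <= 1 for g in groups)

    def search(idx, parent, used):
        if satisfied(parent):
            return parent
        if used == budget or idx == len(edges):
            return None  # bound: no budget left, or no edges left to try
        u, w = edges[idx]
        ru, rw = find(parent, u), find(parent, w)
        if ru != rw:
            # Branch 1: put the edge into H (it joins two components).
            child = dict(parent)
            child[ru] = rw
            found = search(idx + 1, child, used + 1)
            if found is not None:
                return found
        # Branch 2: leave the edge out of H.
        return search(idx + 1, parent, used)

    parent = search(0, {v: v for v in vertices}, 0)
    if parent is None:
        return None
    parts = {}
    for v in vertices:
        parts.setdefault(find(parent, v), set()).add(v)
    return list(parts.values())
```

If the search returns more than k parts, adjacent parts can be merged along edges of G until exactly k remain, since G is connected.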