This is a supervised learning problem.
I have a directed acyclic graph (DAG). Each edge has a vector of features X, and each node (vertex) has a label 0 or 1. The task is to find a cost function w(X), so that the shortest path between any pair of nodes has the highest ratio of 1s to 0s (minimum classification error).
The solution must generalize well. I tried logistic regression, and the learned logistic function predicts the label of a node fairly well given the features of an incoming edge. However, that approach does not take the graph’s topology into account, so the solution over the whole graph is suboptimal. In other words, the logistic function is not a good weight function for the problem setup above.
Although my problem setup is not the typical binary classification problem setup, here is a good intro to it:
http://en.wikipedia.org/wiki/Supervised_learning#How_supervised_learning_algorithms_work
Here are some more details:
- Each feature vector X is a d-dimensional list of real numbers.
- Each edge has a vector of features. That is, given the set of edges E = {e1, e2, ..., en} and the set of feature vectors F = {X1, X2, ..., Xn}, edge ei is associated with vector Xi.
- It is possible to come up with a function f(X), so that f(Xi) gives the likelihood that edge ei points to a node labeled 1. An example of such a function is the one I mentioned above, found through logistic regression. However, as I said, such a function is suboptimal.
SO THE QUESTION IS:
Given the graph, a starting node and a finish node, how do I learn the optimal cost function w(X), so that the ratio of 1-labeled to 0-labeled nodes along the shortest path is maximized (minimum classification error)?
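Written out, the objective might look like the following. This is only a sketch under assumptions that are not part of the setup above: the cost is parametrized by a vector θ, y_v denotes the label of node v, 𝒫(s, t) is the set of directed paths from the start node s to the finish node t, and the fraction of 1-labeled nodes stands in for the raw ratio of 1s to 0s so that the score stays defined when a path contains no 0s.

```latex
P_\theta(s, t) = \arg\min_{P \in \mathcal{P}(s, t)} \sum_{e \in P} w_\theta(X_e)

\theta^{*} = \arg\max_{\theta} \;
  \frac{1}{\lvert V(P_\theta(s, t)) \rvert}
  \sum_{v \in V(P_\theta(s, t))} y_v
```

Here V(P) is the set of nodes visited by path P. The inner arg min is what makes the problem depend on the graph's topology rather than on each edge in isolation.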
This looks like a problem where a genetic algorithm has excellent potential. If you define the desired function as, e.g., a linear combination of the features (you could also add quadratic terms, then cubic, ad infinitum), then the gene is the vector of coefficients. The mutator can be just a random offset applied to one or more coefficients within a reasonable range. The evaluation function is just the average ratio of 1s to 0s along the shortest paths for all pairs, computed with the cost function the current gene defines. At each generation, pick the best few genes as ancestors and mutate them to form the next generation. Repeat until the ueber gene is at hand.
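The loop described above could be sketched roughly as follows. Everything here is illustrative rather than a definitive implementation: the function names, the toy graph, the population size, and the mutation step are all made-up assumptions. The cost is a plain linear combination of the edge features, shortest paths are found by relaxing edges in topological order (which works on a DAG even when some costs are negative), and the score is the fraction of 1-labeled nodes on each path instead of the raw ratio, to avoid dividing by zero.

```python
import random


def topological_order(nodes, edges):
    """Kahn's algorithm. `edges` maps (u, v) -> list of edge features."""
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in edges:
        indegree[v] += 1
        successors[u].append(v)
    ready = [v for v in nodes if indegree[v] == 0]
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return order, successors


def fitness(gene, nodes, edges, labels):
    """Average fraction of 1-labeled nodes on the shortest path of every
    reachable (source, target) pair, with edge cost = gene . features."""
    cost = {e: sum(c * x for c, x in zip(gene, feats)) for e, feats in edges.items()}
    order, successors = topological_order(nodes, edges)
    scores = []
    for source in nodes:
        dist, pred = {source: 0.0}, {}
        for u in order:  # relax edges in topological order
            if u not in dist:
                continue
            for v in successors[u]:
                d = dist[u] + cost[(u, v)]
                if v not in dist or d < dist[v]:
                    dist[v], pred[v] = d, u
        for target in dist:
            if target == source:
                continue
            path = [target]
            while path[-1] != source:  # walk predecessors back to the source
                path.append(pred[path[-1]])
            scores.append(sum(labels[v] for v in path) / len(path))
    return sum(scores) / len(scores) if scores else 0.0


def mutate(gene, rng, step):
    """Add a random offset to one or more coefficients."""
    child = list(gene)
    picked = [i for i in range(len(child)) if rng.random() < 0.5]
    for i in picked or [rng.randrange(len(child))]:
        child[i] += rng.uniform(-step, step)
    return child


def evolve(nodes, edges, labels, dim, generations=30, pop_size=20,
           n_parents=4, step=0.5, seed=0):
    """Keep the best few genes each generation and mutate them."""
    rng = random.Random(seed)

    def score(gene):
        return fitness(gene, nodes, edges, labels)

    population = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(population, key=score, reverse=True)[:n_parents]
        children = [mutate(rng.choice(parents), rng, step)
                    for _ in range(pop_size - n_parents)]
        population = parents + children  # parents survive, so the best never gets worse
    return max(population, key=score)


# Hypothetical toy DAG: two routes from "a" to "d", one through a 1-labeled
# node ("b") and one through a 0-labeled node ("c").
nodes = ["a", "b", "c", "d"]
labels = {"a": 1, "b": 1, "c": 0, "d": 1}
edges = {("a", "b"): [1.0, 0.0], ("b", "d"): [1.0, 0.0],
         ("a", "c"): [0.0, 1.0], ("c", "d"): [0.0, 1.0]}

best = evolve(nodes, edges, labels, dim=2)
print(best, fitness(best, nodes, edges, labels))
```

On a real graph, evaluating all pairs for every gene can get expensive, so sampling pairs or caching fitness values may be worth it. To check generalization, the fitness would need to be measured on held-out nodes or graphs rather than on the data used for selection.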