The problem
I have multiple groups which specify the relationships of symbols.. for example:
[A B C]
[A D E]
[X Y Z]
What these groups mean is that (for the first group) the symbols, A, B, and C are related to each other. (The second group) The symbols A, D, E are related to each other.. and so forth.
Given all these data, I would need to put all the unique symbols into a 1-dimension array wherein the symbols which are somehow related to each other would be placed closer to each other. Given the example above, the result should be something like:
[B C A D E X Y Z]
or
[X Y Z D E A B C]
In this resulting array, since the symbol A has multiple relationships (namely with B and C in one group and with D and E in another) it’s now located between those symbols, somewhat preserving the relationship.
Note that the order is not important. In the result, X Y Z can be placed first or last since those symbols are not related to any other symbols. However, the closeness of the related symbols is what’s important.
What I need help in
I need help in determining an algorithm that takes groups of symbol relationships, then outputs the 1-dimension array using the logic above. I’m pulling my hair out on how to do this since with real data, the number of symbols in a relationship group can vary, there is also no limit to the number of relationship groups and a symbol can have relationships with any other symbol.
Further example
To further illustrate the trickiness of my dilemma, IF you add another relationship group to the example above. Let’s say:
[C Z]
The result now should be something like:
[X Y Z C B A D E]
Notice that the symbols Z and C are now closer together since their relationship was reinforced by the additional data. All previous relationships are still retained in the result also.
The first thing you need to do is to precisely define the result you want.
You do this by defining how good a result is, so that you know which is the best one. Mathematically you do this by a cost function. In this case one would typically choose the sum of the distances between related elements, the sum of the squares of these distances, or the maximal distance. Then a list with a small value of the cost function is the desired result.
It is not clear whether in this case it is feasible to compute the best solution by some special method (maybe if you choose the maximal distance or the sum of the distances as the cost function).
In any case it should be easy to find a good approximation by standard methods.
A simple greedy approach would be to insert each element in the position where the resulting cost function for the whole list is minimal.
Once you have a good starting point you can try to improve it further by modifying the list towards better solutions, for example by swapping elements or rotating parts of the list (local search, hill climbing, simulated annealing, other).