I am reading data from a file, and I need advice on how to design a data structure for it.
Each record has the form
id_1::id_2::similarity_score
Although the data is stored in this form, it also implies the reverse pairing:
id_2::id_1::same_similarity_score
What I want is a data structure that I can use in my program like this.
Let's say I want to use this data to find which item is most similar to a given one:
object.maxSimiliarity(object_id_1)
returns object_id_2 # has max score
But object_id_1 can also appear in the object_id_2 column in the database,
so in the database a record can take either form:
object_id_1::object_id_2::score
or object_id_2::object_id_1::score
So I want to design this data structure in such a way that
k_1,k_2::value <--> k_2,k_1::value
It seems to me that you could use the scores to build, for each id, a list of its matches ordered from best to worst.
If the similarity scores need to be retained, use a list of tuples of the form
(id_best_match_to_id1, similarity_score_to_id1).
I don’t see a way to exploit the fact that similarity is a symmetric relation, where
sim(x,y) == sim(y,x).
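
A minimal Python sketch of the idea, assuming each record is one line of text in the id_1::id_2::score format and that scores parse as floats. The names build_match_lists and max_similarity and the sample data are made up for illustration. It does not exploit the symmetry to save space: each pair is simply stored under both ids, so a lookup works no matter which column an id appeared in.

```python
from collections import defaultdict


def build_match_lists(lines):
    """Map each id to a list of (other_id, score) tuples, best match first."""
    matches = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        id_a, id_b, score = (part.strip() for part in line.split("::"))
        score = float(score)
        # Store the pair under both ids so lookups work in either direction.
        matches[id_a].append((id_b, score))
        matches[id_b].append((id_a, score))
    for pairs in matches.values():
        # Highest similarity score first.
        pairs.sort(key=lambda pair: pair[1], reverse=True)
    return dict(matches)


def max_similarity(matches, item_id):
    """Return the id of the best match for item_id, or None if it is unknown."""
    pairs = matches.get(item_id)
    return pairs[0][0] if pairs else None


# Hypothetical sample records; "b" only ever appears in the second column.
sample = ["a::b::0.9", "a::c::0.4", "c::b::0.7"]
matches = build_match_lists(sample)
print(max_similarity(matches, "a"))  # b
print(max_similarity(matches, "c"))  # b
print(matches["b"])  # [('a', 0.9), ('c', 0.7)]
```

The cost is that every score is stored twice, which is usually acceptable unless the file is very large.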