I have two arrays containing the same elements, but in different orders, and I want to know the extent to which their orders differ.
The method I tried didn’t work. It was as follows:
For each list I built a matrix that recorded, for each pair of elements, whether one was above or below the other in the list. I then calculated the Pearson correlation coefficient of the two matrices. This worked extremely badly. Here’s a trivial example:
list 1: 1 2 3 4
list 2: 1 3 2 4
The method I described above produced matrices like this (where 1 means the row element is higher up the list than the column element, and 0 the reverse):
list 1:
      1 2 3 4
    1   1 1 1
    2     1 1
    3       1
    4

list 2:
      1 2 3 4
    1   1 1 1
    2     0 1
    3       1
    4
Since the only difference is the order of elements 2 and 3, the two lists should be deemed very similar. The Pearson correlation coefficient for those two matrices is 0, suggesting they are not correlated at all. I guess the problem is that what I’m looking for is not really a correlation coefficient, but some other kind of similarity measure. Edit distance, perhaps?
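As a minimal sketch of the pairwise encoding described above, the upper-triangle entries of both matrices can be rebuilt and compared directly. The function name `pair_order_flags` and the flattening of each matrix into a list are assumptions made for illustration, not the code that was actually run. It suggests one likely reason the correlation is uninformative here: the entries for list 1 are all 1, so they have zero variance.

```python
from itertools import combinations

def pair_order_flags(order, reference):
    """For each pair (a, b) drawn from `reference`, give 1 if a comes
    before b in `order`, else 0. Assumes the elements are distinct."""
    position = {element: index for index, element in enumerate(order)}
    return [1 if position[a] < position[b] else 0
            for a, b in combinations(reference, 2)]

list_1 = [1, 2, 3, 4]
list_2 = [1, 3, 2, 4]

# Pairs are visited as (1,2), (1,3), (1,4), (2,3), (2,4), (3,4).
flags_1 = pair_order_flags(list_1, list_1)  # [1, 1, 1, 1, 1, 1]
flags_2 = pair_order_flags(list_2, list_1)  # [1, 1, 1, 0, 1, 1]

# flags_1 is constant, so a Pearson coefficient against it is not
# well defined. Counting the entries that differ is a simpler comparison.
disagreements = sum(x != y for x, y in zip(flags_1, flags_2))
print(disagreements, "of", len(flags_1), "pairs differ")  # 1 of 6 pairs differ
```

Counting pairs whose relative order differs, as the last lines do, is the idea behind the Kendall tau distance, which may be closer to the similarity measure being looked for.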
Can anyone suggest anything better?
Use the mean square of the differences between each element’s index in one list and its index in the other.
Indices of each element of List 1 in List 2 (zero based)
Indices of each element of List 1 in List 1 (zero based)
Differences:
Square of differences:
Average differentness = 8 / 5.
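The steps above can be sketched in Python as follows. This is an illustration only: the function name `mean_square_index_difference` is made up here, the elements are assumed to be distinct and hashable, and the input is the pair of lists from the question rather than the five-element example in this answer.

```python
def mean_square_index_difference(list_a, list_b):
    """Mean of the squared differences between each element's index in
    list_a and its index in list_b (zero based).

    Assumes both lists hold the same distinct, hashable elements.
    """
    position_in_b = {element: index for index, element in enumerate(list_b)}
    squared = [(index_a - position_in_b[element]) ** 2
               for index_a, element in enumerate(list_a)]
    return sum(squared) / len(squared)

# Indices in list 2: 0 2 1 3, indices in list 1: 0 1 2 3
# Differences: 0 -1 1 0, squares: 0 1 1 0, mean: 2 / 4
print(mean_square_index_difference([1, 2, 3, 4], [1, 3, 2, 4]))  # 0.5
```

Identical orderings give 0, and the value grows as elements move further from their original positions.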