I have created the code to perform a mapping of points from one vector to another based on Euclidean distance and have checked that it works correct.
However it is taking too much time. Essentially I have created a matrix for euclidean distance of A and B vectors and found the minimum of it. After I denote the mapping of these points I delete the row and column from the euclidean matrix by marking them as NaN for the next mapping to occur.
Can this code be more efficient because it is awfully slow right now…
Euclid = distance(A,B); % calculates euclid distance column v/s column wise.
for var = 1 : value
%# finds the min of Euclid and its position, when Euclid is viewed as a 1D array
[~, position] = min(Euclid(:));
%#transform the index in the 1D view to 2 indices
[i,j] = ind2sub(size(Euclid),position);
%display(strcat(num2str(i),32, num2str(j)));
mapping = [A(1,i) A(2,i) B(1,j) B(2,j)];
fprintf(FID,'%d %d %d %d\n', mapping );
Euclid( i , : ) = NaN;
Euclid( : , j ) = NaN;
%counter = counter + 1;
end
The problem is that for a 5000 X 5000 matrix the code just hangs for a long time…
Can somebody help me out…
I suppose something to try is reshaping the array of distances into a 1-D array, where you keep careful record of how that new 1-D index will map back into the 2-D index. Then just call a sorting function on the 1-D array, to sort it into ascending order. Make sure to save the permutation of the indices that causes the sort, and then read-off the 2-D array coordinates that correspond to the 1-D coordinates in the sort permutation. In this method, you’ll be at the mercy of Matlab’s sort algorithms, and you’ll need to keep careful track of the index changes from reshaping.
This also poses a problem for removing points from consideration. You would have to keep a running list of indices for points that have already been selected, and if (as you traverse the 1-D permutation) you encounter something that corresponds to an index already selected, then you just skip it (e.g. you don’t set it to NaN, you just skip it from consideration).
This saves you from needing to compute a minimum every time. The only real checking at each iteration of a for-loop that traverses the 1-D sort permutation is the logical check whether the current location has been selected before (due to one of its points being involved in a smaller distance location). On iteration
i, this check takes at mosti-1comparisons, so this will be slightly less than O(n^2).This will be faster than your current method, which calculates the min of the whole array every time, but still not as fast as the O(n log n) method mentioned in the link below.
More generally, you want to read the papers linked in the answers to this question. This is not a trivial problem, not well suited for the RAM intensive Matlab sessions, and probably will require you to rewrite your whole approach to this.
Another idea is that if you broadcast all of the array
Ato a bunch of processors, then this is embarrassingly parallel on the points in arrayB. You can ship out pieces ofBto different processors, and get back a list of all the distances. You’ll have to do some processing back on the head node to select out the index for the min-distance and remove those points, but not too much. Probably you can re-write that part anyway so that you don’t need to compute the distances multiple times.So if you write this in, say, Python or C++, you could quickly use the MPI library and then either run it on a cloud cluster at Amazon Web Services, or on a local cluster if you have access.