I have an n x 2 matrix of integers. The first column is a series 0, 1, -1, 2, -2, but the values appear in the order in which they were compiled from their constituent matrices. The second column is a list of indices from another list.
I would like to sort the matrix by this second column. This would be equivalent to selecting two columns of data in Excel and sorting by Column B (where the data is in columns A and B). Keep in mind that the value in the first column of each row must stay with its second-column counterpart. I have looked at solutions using the following:
data[np.argsort(data[:, 0])]
But this does not seem to work. The matrix in question looks like this:
matrix([[1, 1],
        [1, 3],
        [1, 7],
        ...,
        [2, 1021],
        [2, 1040],
        [2, 1052]])
You could use np.lexsort:
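A minimal sketch of what this could look like. The sample values are made up as a small stand-in for the matrix in the question, and the names order and sorted_data are just illustrative:

```python
import numpy as np

# Small stand-in for the matrix in the question (values are made up).
data = np.matrix([[1, 7], [2, 1040], [1, 1], [2, 1021], [1, 3]])

# np.lexsort wants 1-D keys, so work on a plain ndarray view of the matrix.
temp = data.view(np.ndarray)

# A single key: the second column. The result is an array of row indices.
order = np.lexsort((temp[:, 1],))

# Indexing with those indices reorders whole rows, so each pair stays together.
sorted_data = data[order]
print(sorted_data)
```

The rows come back ordered by the second column, and sorted_data is still a matrix.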
Note that if you pass more than one key to np.lexsort, the last key is the primary key. The next-to-last key is the secondary key, and so on.

Using np.lexsort as I show above requires the use of a temporary array because np.lexsort does not work on numpy matrices. Since temp = data.view(np.ndarray) creates a view, rather than a copy of data, it does not require much extra memory. However, the sorted result is a new array, which does require more memory.
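To illustrate the key order and the memory behaviour, here is a small sketch with made-up values chosen so that the second column has ties. The names shares, result and is_copy are hypothetical, and np.shares_memory is only used here as a convenient check:

```python
import numpy as np

# Made-up data with ties in the second column.
data = np.matrix([[2, 5], [1, 5], [1, 3], [2, 3]])
temp = data.view(np.ndarray)

# The view reuses the matrix's buffer; nothing is copied.
shares = np.shares_memory(temp, data)

# The last key (column 1) is primary; column 0 only breaks ties.
order = np.lexsort((temp[:, 0], temp[:, 1]))
result = data[order]

# Indexing with an index array builds a new array, separate from data.
is_copy = not np.shares_memory(result, data)

print(order, shares, is_copy)
```

Swapping the two keys in the tuple would sort by the first column first and use the second column only to break ties.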
There is also a way to sort by columns in-place. The idea is to view the array as a structured array with two columns. With a structured array, the sort method accepts an order argument that lets you specify fields (columns) as keys.

Notice that since the structured view (temp2) is a view of data, it does not require allocating new memory and copying the array. Also, sorting temp2 modifies data at the same time:
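A sketch of the in-place approach, under a few assumptions: the data is stored as C-contiguous int64, the field names f0 and f1 are my own choice, and the sample values are made up. The record dtype has to match the real dtype of your matrix for the view to work.

```python
import numpy as np

# Made-up sample; the dtype is fixed so the record layout below matches it.
data = np.matrix([[1, 7], [2, 1040], [1, 1], [2, 1021], [1, 3]], dtype=np.int64)

# Plain ndarray view first, then reinterpret each row of two int64 values
# as a single record with fields f0 (first column) and f1 (second column).
arr = data.view(np.ndarray)
temp2 = arr.view([('f0', np.int64), ('f1', np.int64)])  # shape (5, 1)

# Sort the records in place along the rows, using the second column as key.
temp2.sort(order=['f1'], axis=0)

# data now shows the sorted rows, because temp2 shares its memory.
print(data)
```

No sorted copy is created here, which is the main difference from the np.lexsort version.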