I am looking at the following interview question :
Given 2d coordinates , find the k points which are closest to the
origin. Propose a data structure for storing the points and the method to get the k points. Also point out the complexity of the code.
The solution that I have figured out is to save the 2d points in an array. For the first k points, find the distance of each point from the origin and build a max heap. For the remaining points , calculate distance from the origin , say dist. If dist is greater than the topmost element of the heap, then change topmost element of heap to dist and run the heapify() procedure.
This would take O(k) to build the heap and O((n-k)log k) for heapify() procedure , thus the total complexity = O(n log k).
Can anyone suggest a better data structure and/or method , with a possibly better efficiency too ?
EDIT
Would some other data structure be beneficial here ?
I would use order statistics for this one.
Note that we use a modified
SELECTthat uses distance from the origin as the comparison function.A, first element isA[1]and last isA[n].SELECT(A,1,n,k)to find thekth closest element to the origin.A[1..k].One of the benefits of
SELECTthat it partitions the input,so that the smallest
k-1elements are left toA[k].So storing the elements in an array is
O(n).Running
SELECTisO(n).Returning the requested elements is
O(1).