You are given an unsorted sequence of integers that flows into your program as a stream.
There are too many integers to fit into memory.
Imagine there is a function:
int getNext() throws NoSuchElementException;
It returns the next integer from the stream.
Write a function to find the median.
Solve the problem in O(n).
Any ideas?
A hint was given: use a heap (the data structure).
See this paper. It will (likely) take more than one pass. The idea is that in each pass upper and lower bounds are computed such that the median lies between them.
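To make the "bounds per pass" idea concrete, here is a minimal sketch. It is not the algorithm from the paper: it swaps in a plain binary search on the value range, which needs only O(1) memory and at most 32 passes for 32-bit ints (so O(n) work overall for fixed-width integers). It assumes the stream can be reopened for each pass. The names `IntSource`, `replayable` and `lowerMedian` are hypothetical and introduced only for illustration; `getNext()` mirrors the signature in the question.

```java
import java.util.NoSuchElementException;
import java.util.function.Supplier;

public class StreamMedian {
    /** Hypothetical interface mirroring the getNext() signature from the question. */
    public interface IntSource {
        int getNext() throws NoSuchElementException;
    }

    /** Hypothetical helper: wraps an array so it can be replayed as a fresh stream per pass. */
    public static Supplier<IntSource> replayable(int[] data) {
        return () -> new IntSource() {
            private int i = 0;
            public int getNext() {
                if (i >= data.length) throw new NoSuchElementException();
                return data[i++];
            }
        };
    }

    /**
     * Returns the lower median, i.e. the k-th smallest element with k = (n + 1) / 2.
     * Assumption: reopen.get() yields the same elements on every call.
     */
    public static int lowerMedian(Supplier<IntSource> reopen) {
        // Invariant: the median always lies in [lo, hi].
        long lo = Integer.MIN_VALUE, hi = Integer.MAX_VALUE;
        while (lo < hi) {
            long mid = lo + (hi - lo) / 2;
            long n = 0, atMost = 0;
            IntSource s = reopen.get();
            try {
                while (true) {          // one full pass over the stream
                    int x = s.getNext();
                    n++;
                    if (x <= mid) atMost++;
                }
            } catch (NoSuchElementException endOfStream) {
                // end of this pass
            }
            if (n == 0) throw new NoSuchElementException("empty stream");
            long k = (n + 1) / 2;
            if (atMost >= k) hi = mid;  // median is <= mid: tighten the upper bound
            else lo = mid + 1;          // median is > mid: tighten the lower bound
        }
        return (int) lo;
    }
}
```

Each pass halves the interval that must contain the median, so the number of passes depends on the width of the integer type rather than on n. The paper's algorithms instead trade storage against the number of passes.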
A fundamental result from the paper (N = size of the data, P = number of passes):
Theorem 2) A P-pass algorithm which selects the Kth highest of N elements requires
storage at most $O(N^{1/P} (\log N)^{2 - 2/P})$.
Also, for very small amounts of storage S, i.e., for $2 \le S \le O((\log N)^2)$, there is a class of selection algorithms which use
at most $O((\log N)^3 / S)$ passes.
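As a quick sanity check on the Theorem 2 bound (my own substitution, not taken from the paper), plugging in small values of P gives:

P = 1: $O(N^{1} (\log N)^{0}) = O(N)$
P = 2: $O(N^{1/2} (\log N)^{1}) = O(\sqrt{N} \log N)$
P = 3: $O(N^{1/3} (\log N)^{4/3})$

So the bound for a single pass is linear in N, which is no better than storing everything, while each extra pass lowers the exponent on N.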
Read the paper. I'm not really sure what the heap has to do with it.