Can someone explain the solution of this problem to me?
Suppose that you are given a sequence of n elements to sort. The input sequence
consists of n=k subsequences, each containing k elements. The elements in a given
subsequence are all smaller than the elements in the succeeding subsequence and
larger than the elements in the preceding subsequence. Thus, all that is needed to
sort the whole sequence of length n is to sort the k elements in each of the n=k
subsequences. Show an n lg k lower bound on the number of comparisons
needed to solve this variant of the sorting problem.
Solution:
Let S be a sequence of n elements divided into n/k subsequences each of length k
where all of the elements in any subsequence are larger than all of the elements
of a preceding subsequence and smaller than all of the elements of a succeeding
subsequence.
Claim
Any comparison-based sorting algorithm to sort S must take (n lg k) time in the
worst case.
Proof
First notice that, as pointed out in the hint, we cannot prove the lower
bound by multiplying together the lower bounds for sorting each subsequence.
That would only prove that there is no faster algorithm that sorts the subsequences
independently. This was not what we are asked to prove; we cannot introduce any
extra assumptions.
Now, consider the decision tree of height h for any comparison sort for S. Since
the elements of each subsequence can be in any order, any of the k! permutations
correspond to the final sorted order of a subsequence. And, since there are n/k such
subsequences, each of which can be in any order, there are (k!)^n/k permutations
of S that could correspond to the sorting of some input order. Thus, any decision
tree for sorting S must have at least (k!)^n/k leaves. Since a binary tree of height h
has no more than 2^h leaves, we must have 2^h ≥ (k!)^(n/k) or h ≥ lg((k!)^n/k). We
therefore obtain
h ≥ lg((k!)^n/k) -- unbalanced parens - final one added?
= (n/k) lg(k!)
≥ (n/k) lg((k/2)^k/2)
= (n/2) lg(k/2)
The third line comes from k! having its k/2 largest terms being at least k/2 each.
(We implicitly assume here that k is even. We could adjust with floors and ceilings
if k were odd.)
Since there exists at least one path in any decision tree for sorting S that has length
at least (n/2) lg(k/2), the worst-case running time of any comparison-based sorting
algorithm for S is (n lg k).
Can someone walk me through the steps in code block? Especially the step when lg k! becomes lg((k/2)^k/2).
I’ve reprinted the math below:
Let’s walk through this. Going from line (1) to line (2) uses properties of logarithms. Similarly, going from line (3) to line (4) uses properties of logarithms and the facththat (n / k)(k / 2) = (n / 2). So the trick step is going from line (2) to line (3).
The claim here is the following:
Intuitively, the idea is as follows. Consider k! = k(k – 1)(k – 2)…(2)(1). If you’ll notice, half of these terms are greater than k / 2 and half of them are smaller. If we drop all the terms that are less than k, we get something (close to) the following:
Now, we have that k / 2 ≥ k, so we have that
This is the product of (k / 2) with itself (k / 2) times, so it’s equal to (k / 2)k/2. This math isn’t precise because the logic for odd and even values are a bit different, but using essentially this idea you get a sketch of the proof of the earlier result.
To summarize: from (1) to (2) and from (3) to (4) uses properties of logarithms, and from (2) to (3) uses the above result.
Hope this helps!