I’m trying to understand a paragraph in my AI textbook and need some help with it.
Essentially, my question is: why are there 2^(2^n) functions on n attributes if it takes 2^n bits to define a function?
Here is the paragraph from the text (source: AI: A Modern Approach, Stuart Russell and Peter Norvig):
Decision Trees are good for some kinds
of functions and bad for others. Is
there any kind of representation that
is efficient for all kinds of
functions? Unfortunately, no. We can
show this in a very general way.
Consider the set of all Boolean
functions on n attributes. How many
different functions are in this set?
This is just the number of different
truth tables that we can write down,
because the function is defined by its
truth table. The truth table has 2^n
rows, because each input case is
described by n attributes. We can
consider the ‘answer’ column of the
table as a 2^n-bit number that
defines the function. No matter what
representation we use for functions,
some of the functions (almost all of
them, in fact) are going to require at
least that many bits to represent. If it takes 2^n bits to define the
function, then there are 2^(2^n)
different functions on n attributes.
A second question: why do we need a 2^n-bit number (the "2^n-bit number" in the passage above)? I thought we’d need only an n-bit number. For example, if we have 3 attributes, we can define 2^3 = 8 functions, thus needing only 3 bits to define all 8 functions (000, 001, 010, 011, etc.).
I’ve been thinking about this for a while and I’m not sure what eludes me. Thank you for your time in looking into this!
I think I get it, and I think there might be a mistake in your answer…
Let me explain according to my understanding of your example for 3 attributes:
n = 3
Row 1 000
Row 2 001
Row 3 010
…
Row 8 111
Function 0: False for every row, therefore 00000000 (eight ‘0’s, as there are 8 rows)
Function 1: True for row 1, false for the rest: 00000001
Function 2: True for row 2, false for the rest: 00000010
…
Thus there are 2^8 functions, which is 2^(2^3) i.e. 2^(2^n).
Correct?
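For what it's worth, the count above can be checked by brute force. Here is a minimal Python sketch, not taken from the textbook; the name `all_boolean_functions` is made up for illustration. It lists the 2^n input rows, then treats every possible answer column (one bit per row) as a separate function.

```python
from itertools import product


def all_boolean_functions(n):
    """Yield every Boolean function on n attributes as a truth table (dict)."""
    # Each input row is one assignment of the n attributes: 2**n rows in total.
    rows = list(product([0, 1], repeat=n))
    # Each function is one way to fill in the answer column: one bit per row.
    for answers in product([0, 1], repeat=len(rows)):
        yield dict(zip(rows, answers))


n = 3
functions = list(all_boolean_functions(n))

print(2 ** n)          # number of rows in the truth table: 8
print(len(functions))  # number of distinct functions: 256
```

So n bits are enough to pick out one row of the table, but it takes 2^n bits (one answer per row) to pick out one function, which is why the count comes out as 2^(2^n).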