I’m learning the candidate elimination (CE) algorithm for finding the version space, and I’m a little confused about running it on the following example:
We have a cage in which two birds can live together. Each bird is described by:
- Sex (Male, Female)
- Color (Red, Green, Blue)
- Origin (US, Brazil, Russia, Australia, China)
After some iterations, I have the specific hypothesis S and the general hypothesis G as follows:
S << Male, Red, ?>, < Female, Blue, China>>
G << ?, ?, ?>, < ?, ?, ?>>
Now if the training example is:
Negative, i.e. the birds can’t live together: << Female, Red, US>, < Female, Blue, Australia>>
What will the new general hypothesis G be?
Let me write what I believe could be the answers:
New G
Either:
<< Male, ?, ?>, < ?, ?, ?>> &&
<< ?, ?, ?>, < ?, ?, China>>
Or:
<< Male, ?, ?>, < ?, ?, ?>> &&
<< ?, ?, ?>, < ?, ?, China>> &&
<< ?, ?, ?>, < ?, ?, Russia>> &&
<< ?, ?, ?>, < ?, ?, Brazil>> &&
<< ?, ?, ?>, < ?, ?, US>>
I think the second one is correct, because it’s a general hypothesis and it should be general enough to include the other three countries.
The second one is incorrect. According to the candidate elimination algorithm, each minimal specialization added to G must still be more general than some hypothesis in S (equivalently, some hypothesis in S must be more specific than it). The last three specializations you added fail this test against the single hypothesis in S: S requires the second bird to be from China, whereas those specializations require it to be from Russia, Brazil, or the US.
The first version appears correct.
Note that this assumes the order of the two birds in the hypothesis matters.
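To make the check concrete, here is a minimal Python sketch of the specialization step for this one negative example. It is not a full candidate elimination implementation. The function names (`covers`, `specialize`) and the flattening of the two birds into a single 6-tuple are my own assumptions for illustration, and it assumes the order of the birds matters.

```python
ANY = "?"

# Attribute domains for one bird; repeated for both birds (order matters).
BIRD_DOMAINS = [
    ["Male", "Female"],
    ["Red", "Green", "Blue"],
    ["US", "Brazil", "Russia", "Australia", "China"],
]
DOMAINS = BIRD_DOMAINS * 2  # positions 0-2: first bird, 3-5: second bird


def covers(h, x):
    """True if hypothesis h is at least as general as x (an instance or hypothesis)."""
    return all(a == ANY or a == b for a, b in zip(h, x))


def specialize(g, negative, s):
    """Minimal specializations of g that exclude `negative` but still cover s."""
    result = []
    for i, value in enumerate(g):
        if value != ANY:
            continue  # only a '?' can be replaced by a concrete value
        for candidate in DOMAINS[i]:
            if candidate == negative[i]:
                continue  # would still match the negative example
            h = g[:i] + (candidate,) + g[i + 1:]
            if covers(h, s):  # keep only if some member of S is more specific
                result.append(h)
    return result


s = ("Male", "Red", ANY, "Female", "Blue", "China")
g = (ANY,) * 6
negative = ("Female", "Red", "US", "Female", "Blue", "Australia")

new_g = specialize(g, negative, s)
for h in new_g:
    print(h)
```

Only two candidates survive: the one fixing the first bird's sex to Male and the one fixing the second bird's origin to China. Candidates such as "second bird from Russia" exclude the negative example but are dropped because they do not cover S.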