Machine Learning - Summer 2003
Project 2 - Basic concept learning methods
1. Experiments with the Candidate-elimination algorithm
(vs.pl) using the PlayTennis data
(the data file you created for Project 1):
2. Inductive bias:
Extend the PlayTennis data file with taxonomies for the attributes, so
that they can be used as structural attributes. Hint: see Lab
For each class (yes and no) find experimentally the maximal
consistent subset of the PlayTennis data that can be used by the Candidate-elimination
algorithm (vs.pl using the batch query) to create a consistent hypothesis.
For each class (yes and no) find experimentally the minimal
inconsistent subset of the PlayTennis data that can be used by the
Candidate-elimination algorithm (vs.pl using the batch query) to cause
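One way to reason about which subsets can yield a consistent hypothesis is sketched below in Python. It is not vs.pl and does not model taxonomies: it assumes a simplified hypothesis language in which each attribute is either a fixed value or "?" (any value). In that language a consistent conjunctive hypothesis exists exactly when the least general generalization of the positive examples covers none of the negatives. The example rows and all function names are hypothetical; your own PlayTennis file may differ.

```python
# Sketch only: "?" means "any value"; taxonomies are NOT modelled here.
ANY = "?"

def lgg(examples):
    """Least general conjunctive generalization of a list of attribute tuples."""
    hypothesis = list(examples[0])
    for example in examples[1:]:
        for i, value in enumerate(example):
            if hypothesis[i] != value:
                hypothesis[i] = ANY  # values disagree, so generalize this attribute
    return tuple(hypothesis)

def covers(hypothesis, example):
    """True if every attribute of the hypothesis is '?' or matches the example."""
    return all(h == ANY or h == v for h, v in zip(hypothesis, example))

def consistent_hypothesis(positives, negatives):
    """Return the most specific consistent hypothesis, or None if there is none."""
    s = lgg(positives)
    if any(covers(s, n) for n in negatives):
        return None  # every generalization of s also covers that negative
    return s

# Hypothetical rows in PlayTennis style: (outlook, temperature, humidity, wind)
positives = [("overcast", "hot", "high", "weak"),
             ("overcast", "cool", "normal", "strong")]
negatives = [("sunny", "hot", "high", "weak")]

print(consistent_hypothesis(positives, negatives))
# ('overcast', '?', '?', '?')

# One more positive forces outlook to '?', and the negative becomes covered.
extended = positives + [("sunny", "cool", "normal", "weak")]
print(consistent_hypothesis(extended, negatives))
# None
```

With taxonomies, the generalization step would presumably climb to the nearest common ancestor of two values instead of jumping straight to "?", so more subsets may remain consistent. Treat this sketch as a way to predict results, and confirm them with vs.pl.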
3. Experiments with noise:
Find hypotheses for all target concepts defined in animals.pl
(mammal, fish, reptile, bird, amphibian), loandata.pl
(approve, reject) and PlayTennis (yes, no) data using all
three algorithms: id3.pl, lgg.pl and search.pl. Include these hypotheses
in your report.
Analyze these hypotheses and identify the inductive bias of the
above algorithms, i.e. write a short description of the type
of hypotheses each algorithm tends to create (short/long, general/specific,
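To compare hypotheses on the short/long and general/specific dimensions, it can help to put numbers on them. The Python sketch below is one possible way to do that, assuming each hypothesis is written as a conjunctive rule (a mapping from attribute to required value) and assuming the usual PlayTennis domain sizes. The names rule_length, generality and DOMAIN_SIZES are hypothetical and are not part of id3.pl, lgg.pl or search.pl.

```python
from fractions import Fraction
from math import prod

# Assumed number of values per attribute (adjust to your own data file).
DOMAIN_SIZES = {"outlook": 3, "temperature": 3, "humidity": 2, "wind": 2}

def rule_length(rule):
    """Number of attribute tests in a conjunctive rule."""
    return len(rule)

def generality(rule, domain_sizes=DOMAIN_SIZES):
    """Fraction of the instance space covered by a conjunctive rule."""
    return prod(Fraction(1, domain_sizes[attribute]) for attribute in rule)

short_rule = {"outlook": "overcast"}
long_rule = {"outlook": "sunny", "humidity": "normal"}

print(rule_length(short_rule), generality(short_rule))  # 1 1/3
print(rule_length(long_rule), generality(long_rule))    # 2 1/6
```

A rule with fewer tests covers a larger share of the instance space, so length and generality move in opposite directions for rules of this form.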
Documentation and submission: Include the results of all the work
you have done according to the above specifications in a document (Word
or HTML format) with a title page including your name and either send it
as an e-mail attachment or hand it in as a hard copy.
Introduce class noise (add one or more examples with the same attribute values
but a different class value) into your data (animals.pl,
loandata.pl and PlayTennis).
Analyze the behavior of all three algorithms (id3.pl,
lgg.pl and search.pl)
when dealing with class noise. Are they able to produce a model (tree,
rules), and how is this model affected by the noise? Explain this in terms of the
particular learning technique each algorithm uses.
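The kind of class noise described above can be illustrated with a short Python sketch. It assumes each example is a pair of an attribute tuple and a class label; this representation, the function names and the two sample rows are hypothetical and do not reflect the format of the Prolog data files.

```python
from collections import defaultdict

def add_class_noise(examples, index, new_label):
    """Return a new list with a copy of examples[index] whose class is changed."""
    attributes, label = examples[index]
    if new_label == label:
        raise ValueError("noisy label must differ from the original label")
    return examples + [(attributes, new_label)]

def contradictions(examples):
    """Map each attribute tuple seen with more than one class to its classes."""
    labels = defaultdict(set)
    for attributes, label in examples:
        labels[attributes].add(label)
    return {a: c for a, c in labels.items() if len(c) > 1}

# Hypothetical rows in PlayTennis style: (outlook, temperature, humidity, wind)
data = [(("sunny", "hot", "high", "weak"), "no"),
        (("overcast", "hot", "high", "weak"), "yes")]

noisy = add_class_noise(data, 0, "yes")
print(contradictions(data))   # {}
print(len(noisy), len(contradictions(noisy)))  # 3 1
```

Two examples that agree on every attribute but differ in class cannot be separated by any hypothesis over those attributes. How each program reacts to such a pair is what the experiment is meant to reveal.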