Machine Learning - Project 2

Machine Learning - Summer 2003

Project 2 - Basic concept learning methods

Posted: 6/04/2003
Due: 6/10/2003

1. Experiments with the Candidate-elimination algorithm (vs.pl) using the PlayTennis data (the data file you created for Project 1):

Extend the PlayTennis data file with taxonomies for the attributes, so that they can be used as structural attributes. Hint: see Lab experiments 2.
For each class (yes and no) find experimentally the maximal consistent subset of the PlayTennis data that can be used by the Candidate-elimination algorithm (vs.pl using the batch query) to create a consistent hypothesis.
For each class (yes and no) find experimentally the minimal inconsistent subset of the PlayTennis data that can be used by the Candidate-elimination algorithm (vs.pl using the batch query) to cause a contradiction.
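As a rough illustration of what "consistent" and "contradiction" mean once the attributes are structural, the following Python sketch generalizes positive examples through a taxonomy and checks whether the result covers a negative example. It is not vs.pl and it tracks only the most specific hypothesis, not the full version space. The taxonomy (including the node names not_rainy, any_outlook and any_humidity), the two-attribute examples, and all function names are assumptions made up for this sketch.

```python
# Hypothetical taxonomy (child -> parent); the intermediate and root node
# names are invented for this sketch and are not taken from the course files.
PARENT = {
    "sunny": "not_rainy", "overcast": "not_rainy",
    "not_rainy": "any_outlook", "rain": "any_outlook",
    "high": "any_humidity", "normal": "any_humidity",
}

def ancestors(value):
    """Return [value, parent, grandparent, ...] up to the taxonomy root."""
    chain = [value]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def generalize(a, b):
    """Return the lowest taxonomy node that covers both a and b."""
    above_b = set(ancestors(b))
    for node in ancestors(a):
        if node in above_b:
            return node
    raise ValueError(f"{a} and {b} are not in the same taxonomy")

def lgg(examples):
    """Least general hypothesis covering all examples, attribute by attribute."""
    h = list(examples[0])
    for ex in examples[1:]:
        h = [generalize(hv, xv) for hv, xv in zip(h, ex)]
    return tuple(h)

def covers(h, x):
    """A hypothesis covers an example if each of its nodes is an ancestor of the example's value."""
    return all(hv in ancestors(xv) for hv, xv in zip(h, x))

def is_consistent(positives, negatives):
    """True if the most specific generalization of the positives covers no negative."""
    h = lgg(positives)
    return not any(covers(h, n) for n in negatives)

# Illustrative (outlook, humidity) examples, assumed for this sketch.
negatives = [("sunny", "high")]
print(is_consistent([("overcast", "high")], negatives))                       # True
print(is_consistent([("overcast", "high"), ("sunny", "normal")], negatives))  # False
```

In the second call the positives generalize to (not_rainy, any_humidity), which also covers the negative example, so that subset cannot yield a consistent hypothesis under this taxonomy.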

2. Inductive bias:

Find hypotheses for all target concepts defined in animals.pl (mammal, fish, reptile, bird, amphibian), loandata.pl (approve, reject) and PlayTennis (yes, no) data using all three algorithms: id3.pl, lgg.pl and search.pl. Include these hypotheses in your report.
Analyze these hypotheses and identify the inductive bias of the above algorithms, i.e. write a short description of the type of hypotheses each algorithm tends to create (short/long, general/specific, disjoint/overlapping).
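One possible way to back the short/long and disjoint/overlapping comparison with numbers is to copy each hypothesis into a small table of rules and count a few properties. The sketch below assumes a hypothesis can be written as a list of conjunctive rules, each a dictionary of attribute=value tests. That representation, the function names, and the sample rules and examples are all assumptions for illustration, not the output format of id3.pl, lgg.pl or search.pl.

```python
def rule_covers(rule, example):
    """A rule covers an example if every attribute test in the rule matches."""
    return all(example.get(attr) == val for attr, val in rule.items())

def describe(rules, examples):
    """Summarize rule length, coverage, and overlap of a rule set on some examples."""
    lengths = [len(r) for r in rules]
    # For each example, count how many rules cover it.
    covered_by = [sum(rule_covers(r, ex) for r in rules) for ex in examples]
    return {
        "rules": len(rules),
        "avg_length": sum(lengths) / len(lengths) if rules else 0.0,
        "covered": sum(c >= 1 for c in covered_by),      # examples covered at all
        "overlapping": sum(c >= 2 for c in covered_by),  # examples covered by 2+ rules
    }

# Hypothetical rule set and examples, made up for this sketch.
rules = [
    {"outlook": "overcast"},
    {"humidity": "normal", "wind": "weak"},
]
examples = [
    {"outlook": "overcast", "humidity": "normal", "wind": "weak"},
    {"outlook": "sunny", "humidity": "normal", "wind": "weak"},
    {"outlook": "sunny", "humidity": "high", "wind": "weak"},
]
print(describe(rules, examples))
```

A low average length with many covered examples suggests short, general rules; a nonzero overlap count shows that the rules are not disjoint on the data.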

3. Experiments with noise:

Introduce class noise (add one or more examples with the same attribute values as an existing example, but a different class value) into your data (animals.pl, loandata.pl and PlayTennis).

Algorithms (all): id3.pl, lgg.pl, search.pl
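The class noise described in this section can be sketched in Python as follows: pick some existing examples and append copies with the same attribute values but a different class. The (attributes, class) tuple layout, the function name add_class_noise, and the sample data are assumptions for this sketch; the real data files are Prolog, so the noisy examples would still have to be written in their format by hand.

```python
import random

def add_class_noise(examples, classes, n_noisy, seed=0):
    """Return the examples plus n_noisy copies whose class label has been changed.

    examples: list of (attribute_tuple, class_label) pairs (assumed layout).
    classes:  all class labels available for this data set.
    """
    rng = random.Random(seed)   # fixed seed so the experiment can be repeated
    noisy = list(examples)      # keep the original examples unchanged
    for attrs, label in rng.sample(examples, n_noisy):
        wrong = rng.choice([c for c in classes if c != label])
        noisy.append((attrs, wrong))  # same attribute values, different class
    return noisy

# Illustrative data, assumed for this sketch.
data = [
    (("sunny", "high"), "no"),
    (("overcast", "high"), "yes"),
]
for attrs, label in add_class_noise(data, ["yes", "no"], n_noisy=1):
    print(attrs, label)
```

Because each noisy example contradicts an existing one, no hypothesis can be consistent with the whole data set, which is what makes the behavior of the three algorithms worth comparing.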

Documentation and submission: Include the results of all the work you have done according to the above specifications in a document (Word or HTML format) with a title page that includes your name. Either send it as an e-mail attachment or hand it in as a hard copy.