### CS4341 Introduction to Artificial Intelligence, Project 4 (C 2002)

DUE DATE: Tuesday, Feb. 26 at 9 pm.

#### PROJECT DESCRIPTION

This project/homework consists of three parts; you must submit a solution for each part. You may work in groups of two students or individually.
1. Neural Networks
Construct a learning system for face recognition using neural networks and the error backpropagation procedure. This project is based on the source code and dataset provided online as a companion to Chapter 4 of Tom M. Mitchell's "Machine Learning" textbook (McGraw-Hill, 1997). For your convenience, here is a PDF version of the code documentation.

2. Neural Networks + Genetic Algorithms
Describe an approach to train neural networks using a genetic algorithm instead of the error backpropagation algorithm.

3. Decision Trees
Construct a decision tree over a dataset.

#### PROJECT ASSIGNMENT

1. Neural Networks

This part of the project consists of three subparts:

1. Sunglasses Recognizer. Train a neural network to recognize whether the person in a picture is wearing sunglasses.

2. Face Recognizer. Train a neural network to recognize who the person in a picture is among a group of 20 possible people.

3. Pose Recognizer. Train a neural network to recognize whether the person in a picture is looking up, straight, left, or right. (One possible output encoding for such multi-class tasks is sketched right after this list.)
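
Since the face and pose tasks are multi-class problems, part of the work is choosing how to encode the target classes on the output units. As one illustration (plain Python, independent of the CMU package; the class names, the 0.9/0.1 target values, and the winner-take-all rule are assumptions made for this example, not requirements), a 1-of-N encoding uses one output unit per class:

```
# Hedged sketch: 1-of-N target encoding and winner-take-all decoding for the
# 4-class pose task. Class names and target values are illustrative only.
POSES = ["up", "straight", "left", "right"]

def encode_target(pose, on=0.9, off=0.1):
    """Target vector with 0.9 for the true class and 0.1 elsewhere
    (avoiding exact 0/1 keeps sigmoid outputs away from their asymptotes)."""
    return [on if p == pose else off for p in POSES]

def decode_output(outputs):
    """Winner-take-all: the predicted class is the unit with the largest output."""
    return POSES[max(range(len(POSES)), key=lambda i: outputs[i])]

print(encode_target("left"))                 # [0.1, 0.1, 0.9, 0.1]
print(decode_output([0.2, 0.4, 0.7, 0.3]))   # left
```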

You must follow the guidelines below for the training of your neural nets:

2. Neural Networks + Genetic Algorithms

Consider the following approach to training a neural net that uses a genetic algorithm instead of the error backpropagation algorithm. For simplicity, assume that we are training a two-layer feedforward neural net with 4 inputs, 1 hidden layer with 2 hidden nodes, and one output node. We have a collection of, say, n training examples to train the net.

• Each individual in the population encodes a value for the weight of each and every one of the connections in the neural net.
• The fitness of an individual must represent how close the output computed by the net is to the desired output over all the training examples. The higher the fitness of an individual, the better the corresponding neural net computes the target function.
• Cross-over is done at the weight level. That is, cross-over points are restricted to fall only on the boundary between (the encodings of) two weights, never inside (the encoding of) a single weight, as in the sketch below.
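
As a concrete illustration of this set-up (a sketch under assumed parameters, not a prescribed solution), the fragment below uses plain Python with 8 bits per weight, weights decoded into [-1, 1], sigmoid units, and no bias weights. It shows one possible bit-string encoding of the ten connection weights of the 4-2-1 net, a fitness based on the summed squared error over the training examples, and a cross-over operator whose cut points fall only on weight boundaries:

```
# Illustrative sketch only; the bit width, weight range, and absence of bias
# weights are assumptions made for the example.
import math
import random

BITS_PER_WEIGHT = 8          # assumed resolution of one encoded weight
W_MIN, W_MAX = -1.0, 1.0     # assumed range of a decoded weight
N_WEIGHTS = 4 * 2 + 2 * 1    # 4-2-1 net: 8 input->hidden + 2 hidden->output connections

def random_individual():
    """A bit string encoding one candidate weight vector (one gene per weight)."""
    return "".join(random.choice("01") for _ in range(N_WEIGHTS * BITS_PER_WEIGHT))

def decode(bits):
    """Map each 8-bit gene to a real-valued weight in [W_MIN, W_MAX]."""
    step = (W_MAX - W_MIN) / (2 ** BITS_PER_WEIGHT - 1)
    return [W_MIN + step * int(bits[i:i + BITS_PER_WEIGHT], 2)
            for i in range(0, len(bits), BITS_PER_WEIGHT)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def net_output(w, x):
    """Forward pass of the 4-2-1 feedforward net (no bias terms in this sketch)."""
    h = [sigmoid(sum(wi * xi for wi, xi in zip(w[4 * j:4 * j + 4], x))) for j in range(2)]
    return sigmoid(w[8] * h[0] + w[9] * h[1])

def fitness(bits, examples):
    """Higher fitness corresponds to a smaller summed squared error over all examples."""
    w = decode(bits)
    sse = sum((target - net_output(w, x)) ** 2 for x, target in examples)
    return 1.0 / (1.0 + sse)

def crossover(parent_a, parent_b):
    """Single-point cross-over; the cut point always falls on a weight boundary."""
    cut = random.randrange(1, N_WEIGHTS) * BITS_PER_WEIGHT
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]
```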

Solve the following problems:

1. Describe precisely the bit-string encoding of individuals.
2. Suppose that the initial population consists of 100 individuals. Describe in detail how you would obtain those initial individuals.
3. Describe precisely the fitness function.
4. Describe precisely the selection, mutation, and cross-over operators.
5. Describe precisely the genetic algorithm (or sequence of steps) that you would follow to train the neural net using this evolutionary approach. What are the termination conditions of the algorithm?
6. Describe precisely what the output of your algorithm is.

3. Decision Trees

Consider the following dataset, which specifies the type of contact lenses prescribed to a patient based on the patient's age, spectacle prescription, astigmatism, and tear production rate. Use information theory (more precisely, entropy) to construct a minimal decision tree that predicts the type of contact lenses that will be prescribed to a patient from these attributes.
SHOW EACH STEP OF THE CALCULATIONS and the resulting tree.
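
For reference, the quantities involved at each step are the entropy of the current set of examples with respect to the target attribute and the expected entropy that remains after splitting on a candidate attribute; the attribute with the lowest expected entropy (equivalently, the highest information gain) is chosen at each node. This is the standard ID3 formulation:

```
Entropy(S) = \sum_{i=1}^{c} -p_i \log_2 p_i

E(S, A)    = \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \, Entropy(S_v)

Gain(S, A) = Entropy(S) - E(S, A)
```

where p_i is the proportion of examples in S belonging to class i of the target attribute, and S_v is the subset of S for which attribute A has value v.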

```
Attribute age                  values: {young, pre-presbyopic, presbyopic}
Attribute spectacle-prescrip   values: {myope, hypermetrope}
Attribute astigmatism          values: {no, yes}
Attribute tear-prod-rate       values: {reduced, normal}
Attribute contact-lenses       values: {soft, hard, none}

age              spectacle-     astigmatism tear-prod-rate contact-lenses
                 prescription

young            myope          no          reduced         none
young            myope          no          normal          soft
young            myope          yes         reduced         none
young            myope          yes         normal          hard
young            hypermetrope   no          reduced         none
young            hypermetrope   no          normal          soft
young            hypermetrope   yes         reduced         none
young            hypermetrope   yes         normal          hard
pre-presbyopic   myope          no          reduced         none
pre-presbyopic   myope          no          normal          soft
pre-presbyopic   myope          yes         reduced         none
pre-presbyopic   myope          yes         normal          hard
pre-presbyopic   hypermetrope   no          reduced         none
pre-presbyopic   hypermetrope   no          normal          soft
pre-presbyopic   hypermetrope   yes         reduced         none
pre-presbyopic   hypermetrope   yes         normal          none
presbyopic       myope          no          reduced         none
presbyopic       myope          no          normal          none
presbyopic       myope          yes         reduced         none
presbyopic       myope          yes         normal          hard
presbyopic       hypermetrope   no          reduced         none
presbyopic       hypermetrope   no          normal          soft
presbyopic       hypermetrope   yes         reduced         none
presbyopic       hypermetrope   yes         normal          none
```
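
As a sanity check on the hand calculations, the short sketch below (plain Python) computes the entropy of the full example set and the expected entropy left after splitting on tear-prod-rate. The class counts in the comments were read off the table above, and this particular split is shown only as an example of one step of the calculation, not as the asserted best attribute:

```
# Hedged sketch of one entropy calculation for the dataset above.
import math

def entropy(counts):
    """Entropy in bits of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

# Whole dataset (24 examples): 5 soft, 4 hard, 15 none.
e_all = entropy([5, 4, 15])                                            # ~1.326 bits

# Splitting on tear-prod-rate:
#   reduced -> 12 examples, all 'none'               -> entropy 0
#   normal  -> 12 examples: 5 soft, 4 hard, 3 none
e_split = (12 / 24) * entropy([12]) + (12 / 24) * entropy([5, 4, 3])   # ~0.777 bits

print(f"Entropy(S) = {e_all:.3f} bits, E(S, tear-prod-rate) = {e_split:.3f} bits")
```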

#### REPORT AND DUE DATE

This project is due on Tuesday, February 26 at 9:00 pm. The submission should be made using the turnin program.

1. Neural Networks.
• If you develop your own error backpropagation code, you should submit the source code of your program documented using the Departmental Documentation Standard.

• A file proj4_ann.txt with your written report for this part of the assignment. Your report should discuss the following issues:
1. answers to all questions in Part I of Mitchell's assignment (for your convenience, here is a PDF version of the code documentation and of Mitchell's assignment),
2. adaptation of the code (if any) or a description of your own code,
3. the experiments you ran with the system,
4. the topology (number of units in each hidden layer), the number of iterations of the error backpropagation algorithm, and all other input parameters used in each experiment, as well as the output printed by the error backpropagation code, including the accuracy of each neural net,
5. the strengths and weaknesses of the system.
Your report should also include a short user manual explaining how to install, run, and use your system (if different from the CMU package).

2. A file proj4_ann_ga.txt with your answers to the questions in the Neural Networks + Genetic Algorithms part of the assignment.

3. A file proj4_dt.txt with your answers to the decision trees part of the assignment.

#### GRADING

1. Neural Networks
```
Sunglasses	(Q1-Q4)		20 points

Obtaining results	3
Q4:
Code Modifications	5
Classification Accuracy	5
# of Epochs		5
Validation set		1
Test set		1

Face Recognition (Q5-Q8)	30 points

Obtaining results	3
Q8:                     7

Q7:
code modifications	8
#output nodes and
output convention	8
Class Accuracy		1
# Epochs		1
Validation set		1
Test set		1

Pose Recognition (Q9-Q11)	30 points

Obtaining results	10
Code modification	6
Output encoding		6
# epochs		1
Validation set		1
Test set		1

Visualization	(Q12-Q13)	20 points
Q13(a)			10
Q13(b)			10

Report				20 points
Q2-Q4			16
Q5			4
```

2. Neural Networks + Genetic Algorithms 30 points

3. Decision Trees 30 points
```
Level 1 of the tree (4 attributes): 15 points
For each attribute
- using the right formula and taking into
consideration all values of the attribute
and of the target attribute                     2
- right calculations                              1
Selecting the right attribute (least entropy):     3
(this selection will be considered correct if
the attribute w/ least entropy is chosen even if
the calculations of the entropies are wrong).

Level 2 of the tree (3 attributes): 12 points
For each attribute (same as above)             2+1=3
Selecting the right attribute (least entropy):     3

Level 3 of the tree (2 attributes): 9 points
For each attribute (same as above)             2+1=3
Selecting the right attribute (least entropy):     3

Level 4 of the tree (1 attribute, in each case): 4 points
Selecting the right attribute (least entropy):     2

```

This adds up to 40 points: 30 points + 10 extra-credit points

4. Total: 180 points