WPI Worcester Polytechnic Institute

Computer Science Department
------------------------------------------

CS 525D KNOWLEDGE DISCOVERY AND DATA MINING - Fall 2009  
Project 2: Classification

PROF. CAROLINA RUIZ 

DUE DATE: Thursday October 22, 2009.
------------------------------------------

This assignment consists of two parts:
  1. A homework part in which you will focus on the construction of the models.
  2. A project part in which you will focus on the experimental evaluation and analysis of the models.

I. Homework Part

[20 points] Calculate Gain(S,A1) and Gain(S,A2) for the dataset S and the attributes A1 and A2 shown on Slide 8 of the class slides describing the ID3 algorithm. Show each step of the calculation. Include your solution in your written report, not in your oral report.
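
As a way to check hand calculations, the standard ID3 definitions Entropy(S) = -sum_i p_i log2(p_i) and Gain(S,A) = Entropy(S) - sum_v (|S_v|/|S|) Entropy(S_v) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `entropy` and `gain` are made up for this example, and the class counts below are hypothetical toy values, not the Slide 8 dataset.

```python
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a class distribution given as counts."""
    total = sum(counts)
    # Classes with zero examples contribute nothing (0 * log2(0) is taken as 0).
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def gain(parent_counts, partitions):
    """Information gain: Entropy(S) minus the weighted entropy of the subsets S_v."""
    total = sum(parent_counts)
    remainder = sum(sum(p) / total * entropy(p) for p in partitions)
    return entropy(parent_counts) - remainder

# Hypothetical toy data (NOT from Slide 8): 4 positive and 4 negative examples.
S = [4, 4]
split_perfect = [[4, 0], [0, 4]]   # attribute separates the classes completely
split_useless = [[2, 2], [2, 2]]   # attribute leaves the class mix unchanged

print(gain(S, split_perfect))
print(gain(S, split_useless))
```

Each inner list holds the [positive, negative] counts for one attribute value. A split that separates the classes completely recovers the full entropy of S as gain, while a split that leaves the class proportions unchanged has zero gain. The written report should still show each step by hand.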

II. Project Assignment

THOROUGHLY READ AND FOLLOW THE PROJECT GUIDELINES. These guidelines contain detailed information about how to structure your project, and how to prepare your written and oral reports.