Worcester Polytechnic Institute (WPI)

Computer Science Department
------------------------------------------

CS539 Machine Learning - Spring 2003 
Project 6 - Instance-Based Learning and Regression Methods

PROF. CAROLINA RUIZ 

Due Date: Thursday, March 20, 2003 at 8 am.
------------------------------------------


PROJECT DESCRIPTION

Use Instance-based Learning and Regression techniques to construct classifiers for each of the following problems:

  1. Predicting the class attribute (CARAVAN: number of mobile home policies) in The Insurance Company Benchmark (COIL 2000) dataset.

  2. Predicting a numeric attribute of your choice in The Insurance Company Benchmark (COIL 2000) dataset.

  3. Predicting whether the income of a given person is >50K or <=50K using the census-income dataset from the US Census Bureau, which is available at the Univ. of California Irvine Repository.

  4. Predicting a numeric attribute of your choice in the census-income dataset.

PROJECT ASSIGNMENT

  1. Read Chapter 8 of the textbook, on Instance-based Learning, in great detail.

  2. Read the code of the Instance-based Learning and Regression techniques implemented in the Weka system. Some of those techniques are enumerated below:

    • Instance-based Learning:
      • IB1: nearest neighbor classification
      • IBk: k-nearest neighbors classification

    • Regression:
      • Classification via Regression
      • Linear Regression
      • LWR: Locally Weighted Regression
      • Additive Regression (optional)
      • Regression by Discretization
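
As a reading aid for the techniques listed above, the following is a minimal Python sketch of two of the underlying ideas: k-nearest neighbors classification (with k=1 it behaves like plain nearest neighbor) and locally weighted linear regression on a single numeric input. It is not Weka's code and does not reproduce the IB1, IBk, or LWR implementations; the function names, the Euclidean distance, the Gaussian weighting with bandwidth `tau`, and the toy data in the usage note are all assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train, query, k=3):
    """Majority vote among the k training instances closest to `query`.

    `train` is a list of (features, label) pairs (hypothetical layout).
    """
    neighbors = sorted(train, key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def lwr_predict(xs, ys, x_query, tau=1.0):
    """Locally weighted linear regression for one numeric input.

    Each training point gets a Gaussian weight that decays with its
    distance from `x_query`; a weighted least-squares line is then
    fitted and evaluated at `x_query`. `tau` is an assumed bandwidth.
    """
    w = [math.exp(-((x - x_query) ** 2) / (2 * tau ** 2)) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw   # weighted mean of x
    my = sum(wi * y for wi, y in zip(w, ys)) / sw   # weighted mean of y
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0           # flat line if x is constant
    return my + slope * (x_query - mx)
```

A possible usage on made-up data: `knn_classify([((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b")], (0.2, 0.3), k=1)` picks the label of the single closest instance, and `lwr_predict([0, 1, 2, 3, 4], [1, 3, 5, 7, 9], 2.5)` fits a line that emphasizes the points nearest 2.5. The project datasets would additionally need handling of nominal attributes and attribute scaling, which this sketch omits.
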

  3. The following are guidelines for the construction of your Instance-based and Regression Classifiers:

REPORT AND DUE DATE