WPI Worcester Polytechnic Institute

Computer Science Department
------------------------------------------

CS539 Machine Learning 
Assignment 9 - Fall 2000

PROF. CAROLINA RUIZ 

Due: Thursday, November 16, 2000 at 6:00 pm. 
------------------------------------------


PROJECT DESCRIPTION

Use genetic algorithms to construct the most accurate hypothesis you can for predicting whether a given person's income is >50K or <=50K, using the
census-income dataset from the US Census Bureau, which is available at the University of California, Irvine (UCI) repository.
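
For orientation, a genetic algorithm evolves a population of candidate hypotheses through selection, crossover, and mutation. The following is only a minimal sketch, not a prescribed method for this project: it assumes hypotheses are encoded as fixed-length bit strings, and the function name evolve, its parameters, and their default values are all hypothetical. The toy fitness function (counting 1 bits) stands in for what would, in this project, presumably be the classification accuracy of the decoded hypothesis on training data.

```python
import random

def evolve(fitness, length, pop_size=20, generations=50,
           crossover_rate=0.6, mutation_rate=0.01, seed=0):
    """Evolve bit-string hypotheses and return the fittest one seen.

    All names and default values are illustrative assumptions.
    Requires length >= 2 so that a crossover point exists.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Fitness-proportionate selection; the small constant keeps the
        # weights valid even if every hypothesis scores zero.
        weights = [fitness(h) + 1e-9 for h in population]
        next_population = []
        while len(next_population) < pop_size:
            a, b = rng.choices(population, weights=weights, k=2)
            if rng.random() < crossover_rate:
                # Single-point crossover: swap the tails of the two parents.
                point = rng.randint(1, length - 1)
                a, b = a[:point] + b[point:], b[:point] + a[point:]
            for child in (a, b):
                # Point mutation: flip each bit with a small probability.
                next_population.append(
                    [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child])
        population = next_population[:pop_size]
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best

# Toy usage: maximize the number of 1 bits in a 16-bit string.
best = evolve(fitness=sum, length=16)
print(best, sum(best))
```

Because the best hypothesis seen so far is tracked separately from the population, the returned result never gets worse from one generation to the next, even though individual generations may.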

I have downloaded the dataset into the following directory: /cs/courses/cs539/f00/Projects/Census_Income_Data
You can access the dataset from there.

The census-income dataset contains census information for 48,842 people. It has 14 attributes for each person (age, workclass, fnlwgt, education, education-num, marital-status, occupation, relationship, race, sex, capital-gain, capital-loss, hours-per-week, and native-country) and a boolean class attribute that classifies the person's income into one of two categories: >50K or <=50K.
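
As a rough illustration of this record layout, the sketch below parses one line of data into the 14 attributes plus the boolean class. It assumes each record is a single comma-separated line with the class label last and with missing values marked by "?"; check the files in the course directory before relying on that. The function name parse_record, the choice of which attributes to treat as integers, and the sample line are hypothetical.

```python
# Attribute names as listed in the assignment text, in the same order.
ATTRIBUTES = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
]

# Assumption: these attributes hold integers; all others are categorical.
NUMERIC = {"age", "fnlwgt", "education-num",
           "capital-gain", "capital-loss", "hours-per-week"}

def parse_record(line):
    """Return (attributes, label); label is True when the class is >50K."""
    fields = [field.strip() for field in line.strip().split(",")]
    if len(fields) != len(ATTRIBUTES) + 1:
        raise ValueError("expected 14 attributes plus a class label")
    record = {}
    for name, value in zip(ATTRIBUTES, fields):
        # Leave values marked "?" (assumed missing-value marker) as strings.
        if name in NUMERIC and value != "?":
            record[name] = int(value)
        else:
            record[name] = value
    label = fields[-1] == ">50K"
    return record, label

# Hypothetical sample line in the assumed format (not taken from the data).
sample = ("45, Private, 123456, HS-grad, 9, Married-civ-spouse, Sales, "
          "Husband, White, Male, 0, 0, 40, United-States, >50K")
record, label = parse_record(sample)
print(record["age"], record["occupation"], label)
```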


PROJECT ASSIGNMENT

Construct, using a genetic algorithm, the most accurate hypothesis you can to predict the Salary attribute (the >50K or <=50K income class) of the census-income data. The following are guidelines to use genetic algorithms to construct your hypothesis:

REPORT AND DUE DATE