WPI Worcester Polytechnic Institute

Computer Science Department

CS539 Machine Learning - Spring 2005 
Project 3 - Neural Networks


Due Date:


  1. Part 1. Construct the most accurate neural network you can for predicting the class attribute of each of the following datasets (available with the Weka System):
    • CPU dataset
    • Iris dataset

  2. Part 2. Construct the most accurate neural network you can for predicting the class attribute of the Covertype data available at the UCI Machine Learning Repository.


  1. Read Chapter 4 of the textbook about neural networks in great detail.

  2. Read the neural networks code in the Weka system in great detail.

  3. The following are guidelines for the construction of your neural networks:

    • Code: Use the neural network methods implemented in the Weka system, or implement your own code. You can find the Weka module implementing neural nets under Classifiers, functions, MultilayerPerceptron.

    • Topology of your Neural Net: I suggest that you use a 2-layer, feedforward architecture. More specifically, a net consisting of an input layer (not counted as one of the 2 layers), 1 hidden layer, and 1 output layer. Each node in a layer is connected to every node in the next layer, and no nodes in the same layer are connected. However, you can experiment with other architectures in addition to the one suggested here.

      In the case of non-numeric target attributes, decide on a convention that you'll use to match output nodes values and target attribute values.

    • Neural Net Parameters: Besides experimenting with the topology of the neural net, see how varying the learning rate, momentum, number of iterations (training time), decay, size of validation set, and other parameters affect the error backpropagation algorithm and the quality of its results.

    • Training and Testing Instances: You may restrict your experiments to a subset of the instances IF Weka cannot handle your whole dataset. But remember that the more accurate your neural network is, the better.

    • Preprocessing of the Data: A main part of this project is the preprocessing of your dataset. The neural networks implementation in the Weka system provides some data preprocessing capabilities (nominalToBinaryFilter, normalizeAttributes, and normalizeNumericClass). Experiment with that functionality and compare the performance of the error backpropagation algorithm when those built-in capabilities are used vs. the performance when you preprocess the dataset yourself prior to using neural networks. Compare also its performance with and without the removal of missing values.

      Your report should contain a detailed description of the preprocessing of your dataset and justifications of the steps you followed. If Weka does not provide the functionality you need to preprocess your data in the way required to obtain useful patterns, preprocess the data yourself by writing the necessary filters (you can incorporate them into Weka if you wish).

    • Evaluation and Testing: Experiment with different evaluation methods, such as a percentage split (with different split ratios) and n-fold cross-validation. It is OK to keep the number of folds low, given that the training time may be quite high.
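
For the Topology guideline on non-numeric target attributes, one possible convention is to use one output node per class value and to read the prediction off the node with the largest activation. The following is a minimal sketch under that assumption. The helper names `encode_target` and `decode_output` are hypothetical, and the class labels are those of the Iris dataset.

```python
# Hypothetical helpers illustrating a one-output-node-per-class convention.
CLASSES = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]


def encode_target(label, classes=CLASSES):
    """Return one target value per output node: 1.0 for the true class, 0.0 elsewhere."""
    if label not in classes:
        raise ValueError(f"unknown class label: {label!r}")
    return [1.0 if c == label else 0.0 for c in classes]


def decode_output(activations, classes=CLASSES):
    """Map output-node activations back to a class label (largest activation wins)."""
    if len(activations) != len(classes):
        raise ValueError("need exactly one activation per class")
    best = max(range(len(classes)), key=lambda i: activations[i])
    return classes[best]


if __name__ == "__main__":
    print(encode_target("Iris-versicolor"))
    print(decode_output([0.12, 0.31, 0.87]))
```

Other conventions are possible, for example target values of 0.1 and 0.9 instead of 0 and 1. Whichever one you choose, state it in your report.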
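
For the Neural Net Parameters guideline, the effect of the learning rate and momentum can be explored on a toy problem before running full experiments. The sketch below applies the usual gradient descent update with momentum, delta_w(n) = -learning_rate * dE/dw + momentum * delta_w(n-1), to a single weight. The error function E(w) = (w - 3)^2, the function name `train`, and the parameter values are assumptions chosen only for illustration. They are not Weka settings.

```python
def error(w):
    """Toy error surface with its minimum at w = 3."""
    return (w - 3.0) ** 2


def train(learning_rate, momentum, iterations, w=0.0):
    """Minimize E(w) = (w - 3)^2 by gradient descent with a momentum term."""
    delta_prev = 0.0
    for _ in range(iterations):
        grad = 2.0 * (w - 3.0)  # dE/dw for the toy error function
        delta = -learning_rate * grad + momentum * delta_prev
        w += delta
        delta_prev = delta  # remembered for the next iteration's momentum term
    return w


if __name__ == "__main__":
    # Compare a few (learning rate, momentum) settings after the same training time.
    for lr, mom in [(0.01, 0.0), (0.01, 0.9), (0.3, 0.2)]:
        w = train(lr, mom, iterations=50)
        print(f"learning_rate={lr} momentum={mom} -> w={w:.4f} error={error(w):.6f}")
```

A real network has many weights and a far less regular error surface, so this only hints at the kind of behavior to look for when you vary the same parameters in your experiments.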
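
For the Preprocessing guideline, doing the normalization and nominal-to-binary conversion yourself, outside Weka, might look roughly like the sketch below. The helpers `normalize_column` and `nominal_to_binary` are hypothetical. Min-max scaling to [0, 1] is one common choice and is not claimed to reproduce Weka's built-in options exactly. Missing values are assumed to be represented as `None` and are passed through unchanged.

```python
def normalize_column(values):
    """Min-max scale numeric values to [0, 1]; None (missing) stays None."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)
    lo, hi = min(present), max(present)
    span = hi - lo
    if span == 0:
        # Constant column: map every present value to 0.0 to avoid dividing by zero.
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - lo) / span for v in values]


def nominal_to_binary(values):
    """Replace a nominal column by one 0/1 column per distinct value (sorted order)."""
    categories = sorted({v for v in values if v is not None})
    rows = [
        [None] * len(categories) if v is None
        else [1.0 if v == c else 0.0 for c in categories]
        for v in values
    ]
    return categories, rows


if __name__ == "__main__":
    print(normalize_column([2.0, 4.0, None, 6.0]))
    print(nominal_to_binary(["red", "blue", "red"]))
```

Removing instances with missing values, or replacing those values, would be a separate step. Its effect on accuracy is one of the comparisons the guideline asks for.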
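
For the Evaluation and Testing guideline, the difference between a percentage split and n-fold cross-validation can be made concrete by looking at which instances end up in the training and test sets. The following is a minimal sketch that works on instance indices only. The function names `percentage_split` and `k_fold_splits` are hypothetical, and the folds are not stratified by class, so this is not a reimplementation of Weka's evaluation.

```python
import random


def percentage_split(n_instances, train_ratio, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) by the given ratio."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_instances * train_ratio))
    return indices[:cut], indices[cut:]


def k_fold_splits(n_instances, k, seed=0):
    """Yield (train, test) index lists; every instance is tested exactly once."""
    if not 2 <= k <= n_instances:
        raise ValueError("k must be between 2 and the number of instances")
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


if __name__ == "__main__":
    train, test = percentage_split(150, 0.66)
    print(len(train), len(test))
    for train, test in k_fold_splits(150, 3):
        print(len(train), len(test))
```

With k folds the network is trained k times, which is why a low number of folds keeps the total training time manageable.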