Experiments:
You must run a sufficiently large and coherent set of experiments.
Start with a basic experiment with default parameters (if possible),
and design new experiments varying the settings
(i.e., pre-processing, parameters, and/or post-processing, ideally
varying one setting at a time)
based on the results that you obtain in your experiments.
Each experiment should be motivated by a previous experiment,
and by the guiding questions.
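One way to plan experiments that vary one setting at a time is sketched below. This is only an illustrative sketch: the helper `one_at_a_time` and the setting names (`normalize`, `max_depth`, `prune`) are hypothetical and are not part of the project description.

```python
def one_at_a_time(defaults, alternatives):
    """Yield (label, config) pairs: the default configuration first, then
    one configuration per alternative value, each differing from the
    defaults in exactly one setting."""
    yield "baseline", dict(defaults)
    for name, values in alternatives.items():
        for value in values:
            if value == defaults[name]:
                continue  # already covered by the baseline experiment
            config = dict(defaults)
            config[name] = value
            yield f"{name}={value}", config


# Hypothetical settings, for illustration only.
defaults = {"normalize": False, "max_depth": 5, "prune": True}
alternatives = {"normalize": [True], "max_depth": [3, 10], "prune": [False]}

experiments = list(one_at_a_time(defaults, alternatives))
for label, config in experiments:
    print(label, config)
```

In practice you would not run the whole list blindly. Pick the next variation based on what the previous experiment showed.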
Unless otherwise stated, carry out each aspect of the project in both Weka and Python (separately), with more emphasis on Python than on Weka. This includes k-fold cross-validation and every other step. Any functionality needed for the project that is not readily available in a Python package must be implemented in Python by you.
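As an illustration of implementing a missing piece yourself, here is a minimal sketch of k-fold cross-validation using only the Python standard library. The function names, the `fit`/`predict` calling convention, and the toy data are assumptions made for this example. The majority-class baseline is a placeholder (similar in spirit to Weka's ZeroR), not a required model. The sketch assumes `k` is no larger than the number of instances.

```python
import random
from collections import Counter


def kfold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k folds
    whose sizes differ by at most one."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return the accuracy measured on each of the k held-out folds."""
    scores = []
    for held_out in kfold_indices(len(X), k, seed):
        test = set(held_out)
        train = [i for i in range(len(X)) if i not in test]
        model = fit([X[i] for i in train], [y[i] for i in train])
        predictions = predict(model, [X[i] for i in held_out])
        correct = sum(p == y[i] for p, i in zip(predictions, held_out))
        scores.append(correct / len(held_out))
    return scores


# Placeholder model: always predict the most frequent training class.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, X_test):
    return [model] * len(X_test)


# Toy data for illustration only: 20 instances, 14 of class "a", 6 of class "b".
X = [[i] for i in range(20)]
y = ["a"] * 14 + ["b"] * 6

scores = cross_validate(X, y, fit_majority, predict_majority, k=5)
print("fold accuracies:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

Fixing the seed keeps the folds identical across experiments, so differences in accuracy come from the setting you changed rather than from a different split.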
- For each experiment you ran, describe:
- Objectives: Which of your 3 specific questions/conjectures
about the dataset domain you aim to answer/validate with
this experiment. Describe also any additional objectives for this
experiment that might have been motivated by your previous
experiments.
- Data: What data did you use to construct and test your model?
- Parameters and Settings:
Describe what parameter values and other settings you used
and why.
- Additional Pre- or Post-Processing:
Any additional pre- or post-processing applied to the data or the
model in order to improve the model's performance,
as measured by the performance metric(s) chosen.
- Analysis of the constructed model:
- Describe the constructed model
(e.g., size of the model, readability).
If the model is readable summarize in your own words what the model
says, focusing on the most interesting/relevant patterns.
Elaborate on if and how the model answers the objectives of this
experiment.
- State what the performance of the model is, using the performance
metrics provided in the project description. If applicable,
elaborate on the confusion matrix and/or other relevant
performance indicators.
- How long did it take Weka/Python to construct this model?
- Compare the performance of this model with that of other
models constructed in this project for this dataset.
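The analysis items above (model construction time, confusion matrix, and comparison across models) could be collected in Python along the lines of the sketch below. Everything here is illustrative: the two placeholder models, the helper names, and the toy data are assumptions, and the printed numbers are not real results. Use the performance metrics given in the project description rather than plain accuracy if they differ.

```python
import time
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Fraction of instances on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def timed(build, *args):
    """Call build(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = build(*args)
    return result, time.perf_counter() - start


# Placeholder model 1: always predict the most frequent training class.
def fit_majority(X_train, y_train):
    majority = Counter(y_train).most_common(1)[0][0]
    return lambda x: majority


# Placeholder model 2 (hypothetical): predict "yes" above the midpoint
# of the two class means of a single numeric attribute.
def fit_threshold(X_train, y_train):
    yes = [x for x, c in zip(X_train, y_train) if c == "yes"]
    no = [x for x, c in zip(X_train, y_train) if c == "no"]
    threshold = (sum(yes) / len(yes) + sum(no) / len(no)) / 2
    return lambda x: "yes" if x > threshold else "no"


# Toy data for illustration only.
labels = ["yes", "no"]
X_train, y_train = [1, 2, 3, 4, 5, 6], ["no", "no", "no", "no", "yes", "yes"]
X_test, y_test = [0, 3.5, 4.5, 7], ["no", "no", "yes", "yes"]

results = []
for name, fit in [("majority", fit_majority), ("threshold", fit_threshold)]:
    model, seconds = timed(fit, X_train, y_train)  # time model construction only
    predicted = [model(x) for x in X_test]
    matrix = confusion_matrix(y_test, predicted, labels)
    results.append((name, accuracy(matrix), seconds, matrix))

# Comparison across models, best accuracy first.
results.sort(key=lambda row: row[1], reverse=True)
for name, acc, seconds, matrix in results:
    print(f"{name}: accuracy={acc:.2f} build_time={seconds:.6f}s matrix={matrix}")
```

Keeping one row per experiment in a structure like `results` makes the final comparison across models straightforward to report.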