In particular, you should know what an association rule is;
metrics to quantify association rules (e.g., support, confidence, lift, leverage, conviction, interest factor, correlation analysis, IS measure, ...);
the Apriori principle;
the Apriori algorithms to construct association rules,
including frequent itemset generation with candidate generation and pruning
(join/merge condition and subset pruning), and
rule generation and confidence-based pruning.
You should be able to use these algorithms to construct association rules from data
by hand during the test.
See examples provided in the Lecture Notes linked above.
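As a way to check hand computations, the two phases listed above can be sketched in Python. This is a minimal illustration rather than the version from the Lecture Notes or from Weka: the function names `apriori` and `generate_rules`, the toy transactions, and the thresholds are all assumptions made for this example, and items are assumed to be sortable strings.

```python
from itertools import combinations

def apriori(transactions, min_support_count):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    counts = {}
    for t in transactions:                      # level 1: count single items
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support_count}
    all_frequent = dict(frequent)
    k = 2
    while frequent:
        prev = sorted(tuple(sorted(s)) for s in frequent)
        candidates = set()
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                a, b = prev[i], prev[j]
                # Join/merge condition: the first k-2 items must be identical.
                if a[:k - 2] != b[:k - 2]:
                    break  # prev is sorted, so no later b can match a
                cand = frozenset(a) | frozenset(b)
                # Subset pruning: every (k-1)-subset must be frequent.
                if all(frozenset(sub) in frequent
                       for sub in combinations(cand, k - 1)):
                    candidates.add(cand)
        # Support counting for the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support_count}
        all_frequent.update(frequent)
        k += 1
    return all_frequent

def generate_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue
        consequents = [frozenset([i]) for i in itemset]  # 1-item consequents
        m = 1
        while consequents and m < len(itemset):
            kept = []
            for cons in consequents:
                ante = itemset - cons
                conf = count / frequent[ante]  # ante is frequent by the Apriori principle
                if conf >= min_confidence:
                    rules.append((ante, cons, conf))
                    kept.append(cons)
            # Confidence-based pruning: larger consequents are built only
            # from consequents whose rules passed the threshold.
            consequents = list({a | b for a in kept for b in kept
                                if len(a | b) == m + 1})
            m += 1
    return rules

# Toy data for illustration only (not the project dataset).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]
frequent = apriori(transactions, min_support_count=3)
rules = generate_rules(frequent, min_confidence=0.8)
for ante, cons, conf in rules:
    print(sorted(ante), "->", sorted(cons), conf)
```

On this toy data, the only 3-item candidate that survives the join and subset pruning is {Bread, Diapers, Milk}, and it is then discarded because it occurs in only two transactions. Tracing that step by hand is good practice for the test.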
THOROUGHLY READ AND FOLLOW THE
These guidelines contain detailed information about how to structure your
project, how to prepare your written summary, and how to study for the test.
For classification association rules (CARs):
Pick either income (<$50K or >$50K) OR sex as the target attribute.
Decide which of these two attributes would be a better target and use it for all your classification experiments.
Use support, confidence, lift, leverage, and conviction. Include
in your report a definition (using a precise formula) and a description
of the meaning of each of these metrics.
Also, for extra credit you are
encouraged (but not required) to implement in Weka other association rule
metrics defined in Section 6.7 of the textbook (e.g., interest factor,
correlation analysis, IS measure, ...), and experiment with them.
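As a starting point for checking your own definitions, the sketch below computes the five required metrics for a rule X -> Y from raw counts. The formulas are the commonly used ones; verify them against the textbook and against how Weka reports each metric before relying on them. The function name `rule_metrics` and its parameters are assumptions made for this example, and the counts are assumed to be nonzero.

```python
import math

def rule_metrics(n, n_x, n_y, n_xy):
    """Metrics for a rule X -> Y.

    n    : total number of transactions
    n_x  : transactions containing X
    n_y  : transactions containing Y
    n_xy : transactions containing both X and Y
    Assumes n, n_x and n_y are all greater than zero.
    """
    s_x, s_y, s_xy = n_x / n, n_y / n, n_xy / n
    confidence = n_xy / n_x                 # s(X u Y) / s(X)
    lift = confidence / s_y                 # c(X -> Y) / s(Y)
    leverage = s_xy - s_x * s_y             # s(X u Y) - s(X) * s(Y)
    if confidence == 1:
        conviction = math.inf               # the rule is never violated
    else:
        conviction = (1 - s_y) / (1 - confidence)
    return {
        "support": s_xy,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }

# Hypothetical counts: 5 transactions, X in 4, Y in 3, both in 3.
metrics = rule_metrics(n=5, n_x=4, n_y=3, n_xy=3)
for name, value in metrics.items():
    print(name, round(value, 4))
```

With these hypothetical counts the rule has support 0.6, confidence 0.75, lift of about 1.25, leverage of about 0.12, and conviction of about 1.6. A lift above 1 and a positive leverage both indicate that X and Y occur together more often than they would if they were independent.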
Use visualizations of the sets of association rules obtained and
analyze those visualizations.
Read the association rules obtained and pick a handful of interesting
ones to describe in your report.
In contrast with our previous classification and regression projects,
we won't use any evaluation protocol (e.g., 10-fold cross validation)
for the association analysis of this project, as we're not using the
rules for prediction.
Focus instead on experimenting with different ways of preprocessing
the data, varying the parameters of the Apriori algorithm, and
providing your own method to evaluate the resulting collections of
association rules. Remember to experiment with CARs (that is, classification
association rules) and to compare their classification performance to that of
decision trees; and remember to experiment with non-CAR rules also.
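If you post-process mined rules outside Weka, one way to separate CARs from general rules is sketched below. It assumes that rules are available as (antecedent, consequent, confidence) tuples and that items are encoded as "attribute=value" strings. Both the encoding and the function name `split_cars` are assumptions made for this example, and the rules and confidence values shown are made up.

```python
def split_cars(rules, class_attribute):
    """Split rules into (CARs, other rules).

    A rule counts as a CAR here when its consequent is exactly one item
    of the class attribute and the class attribute does not appear in
    the antecedent.
    """
    prefix = class_attribute + "="
    cars, others = [], []
    for ante, cons, conf in rules:
        is_car = (
            len(cons) == 1
            and next(iter(cons)).startswith(prefix)
            and not any(item.startswith(prefix) for item in ante)
        )
        (cars if is_car else others).append((ante, cons, conf))
    return cars, others

# Made-up rules for illustration only; not results from the project data.
example_rules = [
    (frozenset({"education=Bachelors"}), frozenset({"income=>50K"}), 0.62),
    (frozenset({"income=>50K"}), frozenset({"sex=Male"}), 0.85),
    (frozenset({"sex=Male"}), frozenset({"income=>50K", "education=Bachelors"}), 0.10),
]
cars, others = split_cars(example_rules, class_attribute="income")
print(len(cars), "CARs,", len(others), "other rules")
```

Only the first made-up rule qualifies as a CAR for the target `income`: the second predicts a different attribute, and the third has a two-item consequent.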
Investigate in more depth (experimentally, theoretically, or both) a topic of your
choice that is related to association rule mining
and that is not covered already in this project.
This topic related to association rule mining might be something that was described or mentioned
in the textbook or in class, or that comes from your own research, or that is related
to your interests, or that appears in a research paper that you find intriguing.