

Data Mining in Bioinformatics. Series: Advanced Information and Knowledge Processing. Wang, J.T.L.; Zaki, M.J.; Toivonen, H.; Shasha, D.E. (Eds.). 1st Edition, 2005, XI, Springer.
java -Xmx768m -jar weka.jar
The main objective of this lecture is to become familiar with the Weka system.
@relation contact-lenses
@attribute age {young, pre-presbyopic, presbyopic}
@attribute spectacle-prescrip {myope, hypermetrope}
@attribute astigmatism {no, yes}
@attribute tear-prod-rate {reduced, normal}
@attribute contact-lenses {soft, hard, none}
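These attributes are the "column names" of the tabular data that follows the @data line. Each data row lists one value per attribute, separated by commas, in the order the attributes were declared.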
@data
young,myope,no,reduced,none
young,myope,no,normal,soft
young,myope,yes,reduced,none
young,myope,yes,normal,hard
...
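The header and data layout above can be read with a few lines of Python. The following is a minimal sketch, not Weka's own loader: it assumes nominal-only attributes, no quoted values, and no spaces inside attribute names, and it contains only the four data rows shown above (the elided rows are left out). The names `ARFF_TEXT` and `parse_nominal_arff` are hypothetical and chosen for illustration.

```python
# Minimal reader for nominal-only ARFF text (illustrative sketch, not Weka code).
ARFF_TEXT = """\
@relation contact-lenses
@attribute age {young, pre-presbyopic, presbyopic}
@attribute spectacle-prescrip {myope, hypermetrope}
@attribute astigmatism {no, yes}
@attribute tear-prod-rate {reduced, normal}
@attribute contact-lenses {soft, hard, none}
@data
young,myope,no,reduced,none
young,myope,no,normal,soft
young,myope,yes,reduced,none
young,myope,yes,normal,hard
"""


def parse_nominal_arff(text):
    """Return (relation, attributes, rows) for a nominal-only ARFF string."""
    relation = None
    attributes = []  # list of (name, [allowed values]) in declaration order
    rows = []        # one dict per data row, keyed by attribute name
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and ARFF comments
            continue
        if in_data:
            values = [v.strip() for v in line.split(",")]
            if len(values) != len(attributes):
                raise ValueError("wrong number of values: " + line)
            for (name, allowed), value in zip(attributes, values):
                # "?" marks a missing value in ARFF
                if value != "?" and value not in allowed:
                    raise ValueError(value + " is not allowed for " + name)
            rows.append({name: v for (name, _), v in zip(attributes, values)})
            continue
        keyword = line.split(None, 1)[0].lower()
        if keyword == "@relation":
            relation = line.split(None, 1)[1]
        elif keyword == "@attribute":
            _, name, spec = line.split(None, 2)
            allowed = [v.strip() for v in spec.strip("{}").split(",")]
            attributes.append((name, allowed))
        elif keyword == "@data":
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_nominal_arff(ARFF_TEXT)
print(relation)
print([name for name, _ in attributes])
print(rows[1])
```

The last attribute, contact-lenses, is the one the other four are used to predict in this dataset, so `rows[1]["contact-lenses"]` gives the class label of the second instance.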
Note that for Principal Components Analysis on Weka's Attribute Selection tab:
- CenterData=True does the following:
  - Centers the attributes (that is, makes each attribute's mean = 0 by subtracting the attribute's mean from each attribute value)
  - Applies PCA to the data's covariance matrix
- CenterData=False does the following:
  - Centers the attributes (that is, makes each attribute's mean = 0)
  - Standardizes the attributes (that is, makes each attribute's standard deviation = 1)
  - Applies PCA to the data's correlation matrix
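The difference between the two settings can be checked numerically. The sketch below is plain Python, not Weka code: it uses a small made-up two-attribute dataset and assumes the sample (n - 1) form of the covariance. It shows that the covariance matrix of centered and standardized data is the correlation matrix of the original data, which is why the second setting amounts to PCA on the correlation matrix. All function names are hypothetical.

```python
import math

# Hypothetical data: two numeric attributes on very different scales.
DATA = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0], [4.0, 400.0]]


def column_means(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def center(rows):
    """Subtract each attribute's mean so that every column has mean 0."""
    means = column_means(rows)
    return [[v - m for v, m in zip(r, means)] for r in rows]


def covariance_matrix(rows):
    """Sample covariance matrix (divides by n - 1; an assumption of this sketch)."""
    c = center(rows)
    n, d = len(c), len(c[0])
    return [[sum(r[i] * r[j] for r in c) / (n - 1) for j in range(d)]
            for i in range(d)]


def standardize(rows):
    """Center each attribute, then divide by its standard deviation."""
    cov = covariance_matrix(rows)
    sds = [math.sqrt(cov[j][j]) for j in range(len(cov))]
    return [[v / s for v, s in zip(r, sds)] for r in center(rows)]


def correlation_matrix(rows):
    cov = covariance_matrix(rows)
    d = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(d)]
            for i in range(d)]


cov_centered = covariance_matrix(center(DATA))            # centering only
cov_standardized = covariance_matrix(standardize(DATA))   # centering + standardizing
corr = correlation_matrix(DATA)

print("covariance after centering:     ", cov_centered)
print("covariance after standardizing: ", cov_standardized)
print("correlation of the original data:", corr)
```

With centering only, the second attribute's variance is far larger than the first's, so it dominates the covariance matrix and therefore the leading principal component. After standardizing, both attributes have variance 1 and contribute on an equal footing.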