Data Mining in Bioinformatics. Series: Advanced Information and Knowledge Processing. Wang, J.T.L.; Zaki, M.J.; Toivonen, H.; Shasha, D.E. (Eds.), 1st Edition, 2005, XI, Springer.
java -Xmx768m -jar weka.jar
The main objective of this lecture is to become familiar with the Weka system.
@relation contact-lenses
@attribute age {young, pre-presbyopic, presbyopic}
@attribute spectacle-prescrip {myope, hypermetrope}
@attribute astigmatism {no, yes}
@attribute tear-prod-rate {reduced, normal}
@attribute contact-lenses {soft, hard, none}
These attributes are the "column names" of the tabular data that follows.
@data
young,myope,no,reduced,none
young,myope,no,normal,soft
young,myope,yes,reduced,none
young,myope,yes,normal,hard
...
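The ARFF layout shown here (a relation name, a list of nominal attributes, then comma-separated data rows) can be read with a few lines of code. The following is a minimal sketch in Python, not Weka's own loader: the function name `parse_arff` is hypothetical, it assumes nominal attributes only with one declaration or data row per line, and the sample contains only the four rows listed in these notes.

```python
def parse_arff(text):
    """Parse a minimal ARFF string (nominal attributes only, one item per line)."""
    relation = None
    attributes = []  # list of (attribute name, list of allowed values)
    rows = []        # one dict per data row, keyed by attribute name
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and ARFF comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            names = [name for name, _ in attributes]
            rows.append(dict(zip(names, values)))
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            allowed = [v.strip() for v in spec.strip("{}").split(",")]
            attributes.append((name, allowed))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


# Sample built from the header and the four data rows shown in these notes.
SAMPLE = """\
@relation contact-lenses
@attribute age {young, pre-presbyopic, presbyopic}
@attribute spectacle-prescrip {myope, hypermetrope}
@attribute astigmatism {no, yes}
@attribute tear-prod-rate {reduced, normal}
@attribute contact-lenses {soft, hard, none}
@data
young,myope,no,reduced,none
young,myope,no,normal,soft
young,myope,yes,reduced,none
young,myope,yes,normal,hard
"""

relation, attributes, rows = parse_arff(SAMPLE)
print(relation)
for name, allowed in attributes:
    print(name, allowed)
for row in rows:
    print(row)
```

Each data row is matched to the attribute declarations by position, which is why the order of the @attribute lines matters: the first value in a row belongs to age, the last to contact-lenses.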
Note that for Principal Components Analysis on Weka's Attribute Selection tab:
- CenterData=True does the following:
  - Centers the attributes (that is, makes each attribute's mean = 0 by subtracting the attribute's mean from each attribute value)
  - Applies PCA to the data's covariance matrix
- CenterData=False does the following:
  - Centers the attributes (that is, makes each attribute's mean = 0)
  - Standardizes the attributes (that is, makes each attribute's standard deviation = 1)
  - Applies PCA to the data's correlation matrix
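The link between the two settings can be checked numerically: the covariance of standardized attributes equals the correlation of the original attributes, while centering alone leaves the covariance unchanged. The sketch below uses only the Python standard library; it illustrates the arithmetic and is not Weka's implementation. The two columns `height_cm` and `weight_kg` are made-up values chosen for illustration.

```python
from statistics import mean, stdev


def center(col):
    """Subtract the column mean so the result has mean 0."""
    m = mean(col)
    return [x - m for x in col]


def standardize(col):
    """Center the column, then divide by its sample standard deviation."""
    s = stdev(col)
    return [x / s for x in center(col)]


def covariance(a, b):
    """Sample covariance of two equal-length columns."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def correlation(a, b):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(a, b) / (stdev(a) * stdev(b))


# Hypothetical numeric attributes on very different scales.
height_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weight_kg = [52.0, 58.0, 69.0, 77.0, 94.0]

# Centering only: the covariance is the same as for the raw data.
cov_raw = covariance(height_cm, weight_kg)
cov_centered = covariance(center(height_cm), center(weight_kg))

# Centering and standardizing: the covariance becomes the correlation.
cov_standardized = covariance(standardize(height_cm), standardize(weight_kg))
corr_raw = correlation(height_cm, weight_kg)

print(cov_raw, cov_centered)
print(cov_standardized, corr_raw)
```

In practice this means that with centering only, attributes measured on large scales tend to dominate the principal components, whereas standardizing puts all attributes on an equal footing.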