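D now contains 200 data instances. The first column is a randomly generated number, and the second column tells whether the number came from X (label 1) or from Y (label 2). See D contents.

% 100 samples from N(90, 10) and 100 samples from N(60, 10)
X = random('Normal',90,10,1,100);
Y = random('Normal',60,10,1,100);
% Column 1: the sampled value; column 2: the source label (1 = X, 2 = Y)
D(1:100,1) = X;
D(1:100,2) = 1;
D(101:200,1) = Y;
D(101:200,2) = 2;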
=== Run information ===

Scheme:       weka.clusterers.EM -I 100 -N 2 -M 8.0 -S 100
Relation:     em_example
Instances:    200
Attributes:   2
              A
Ignored:
              class
Test mode:    Classes to clusters evaluation on training data

=== Model and evaluation on training set ===

EM
==

Number of clusters: 2

             Cluster
Attribute          0        1
              (0.49)   (0.51)
============================
A
  mean       89.7246   60.419
  std. dev.   9.0504   8.8655

Clustered Instances

0      100 ( 50%)
1      100 ( 50%)

Log likelihood: -4.1775

Class attribute: class
Classes to Clusters:

  0  1  <-- assigned to cluster
 94  6 | 1
  6 94 | 2

Cluster 0 <-- 1
Cluster 1 <-- 2

Incorrectly clustered instances :   12.0     6 %
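The 6 % figure in the Weka output follows from the classes-to-clusters matrix: the off-diagonal entries are the instances that landed in the wrong cluster. A minimal sketch of that arithmetic, with the matrix copied from the output (the variable names are illustrative, not part of the original):

```matlab
% Classes-to-clusters matrix from the Weka output above
% (rows: true class 1 and 2; columns: cluster 0 and 1).
C = [94  6;
      6 94];

total         = sum(C(:));             % 200 instances
misclassified = total - trace(C);      % off-diagonal counts: 6 + 6
errorRate     = 100 * misclassified / total;

fprintf('%d of %d instances misclustered (%.1f %%)\n', ...
        misclassified, total, errorRate);
```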
obj =
Gaussian mixture distribution with 2 components in 1 dimensions
Component 1:
Mixing proportion: 0.482694
Mean: 91.0523
Component 2:
Mixing proportion: 0.517306
Mean: 60.0800
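The call that produced this object is not shown here. One way to obtain a comparable fit in MATLAB is sketched below, assuming the Statistics and Machine Learning Toolbox and its `fitgmdist` function. The seed and the rebuilt `D` are assumptions for illustration, so the estimated means and proportions will differ slightly from the values printed above, and the component numbering may be swapped.

```matlab
% Sketch only: assumes the Statistics and Machine Learning Toolbox.
rng(100);                                  % arbitrary seed, for repeatability

% Rebuild the data set: value in column 1, source label in column 2.
X = random('Normal', 90, 10, 1, 100);
Y = random('Normal', 60, 10, 1, 100);
D = [X(:), ones(100, 1); Y(:), 2 * ones(100, 1)];

% Fit a 2-component Gaussian mixture to the values with EM.
obj = fitgmdist(D(:, 1), 2);

disp(obj.mu');                             % estimated component means
disp(obj.ComponentProportion);             % estimated mixing proportions

% Assign each instance to its most probable component and compare
% with the true labels (rows: class, columns: cluster).
idx = cluster(obj, D(:, 1));
crosstab(D(:, 2), idx)
```

The cross-tabulation plays the same role as Weka's classes-to-clusters matrix, which makes the two tools' results directly comparable.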