 
 


Consider the dataset below. This dataset is an adaptation of the IQ Brain Size Dataset.
@relation small_twin_dataset
@attribute CCMIDSA numeric 		% Corpus Callosum Surface Area (cm2)
@attribute GENDER {male,female}
@attribute TOTVOL numeric 		% Total Brain Volume (cm3)
@attribute WEIGHT numeric 		% Body Weight (kg)
@attribute FIQ numeric 			% Full-Scale IQ
@data
6.08,	female,	1005,	57.607,		96
5.73,	female,	963,	58.968,		89
7.99,	female,	1281,	63.958,		101
8.42,	female,	1272,	61.69,		103
6.84,	female,	1079,	107.503,	96
6.43,	female,	1070,	83.009,		126
7.6,	male,	1347,	97.524,		94
6.03,	male,	1029,	81.648,		97
7.52,	male,	1204,	79.38,		113
7.67,	male,	1160,	72.576,		124
Assume that we want to predict the FIQ attribute (prediction target) from the other predicting attributes CCMIDSA, GENDER, TOTVOL, and WEIGHT.
Show your work in your report.
CCMIDSA	GENDER 	TOTVOL 	WEIGHT 		FIQ
6.48,	female,	1034,	62.143,		?
6.59,	male,	1100,	88.452,		?
Show your work in your report.
                                                         4-NN          
                                                         PREDICTION   
CCMIDSA	GENDER 	TOTVOL 	WEIGHT 		FIQ
6.48,	female,	1034,	62.143,		127		___________ 
6.59,	male,	1100,	88.452,		114		___________ 
                                                         4-NN        
                                                         ERROR      
root mean-square error (see p. 178)                      __________
mean absolute error (see p. 178)                         __________
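The 4-NN prediction and the two error measures in the table above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the required solution method: it assumes plain Euclidean distance on the unmodified numeric attributes, a 0/1 distance for the nominal GENDER attribute, and an unweighted average of the neighbors' FIQ values. The names distance, knn_predict, rmse, and mae are hypothetical helpers introduced here.

```python
import math

# Training instances copied from the @data section: (CCMIDSA, GENDER, TOTVOL, WEIGHT, FIQ).
TRAIN = [
    (6.08, "female", 1005, 57.607, 96),
    (5.73, "female", 963, 58.968, 89),
    (7.99, "female", 1281, 63.958, 101),
    (8.42, "female", 1272, 61.69, 103),
    (6.84, "female", 1079, 107.503, 96),
    (6.43, "female", 1070, 83.009, 126),
    (7.6, "male", 1347, 97.524, 94),
    (6.03, "male", 1029, 81.648, 97),
    (7.52, "male", 1204, 79.38, 113),
    (7.67, "male", 1160, 72.576, 124),
]

# The two test instances, with their actual FIQ values in the last position.
TEST = [
    (6.48, "female", 1034, 62.143, 127),
    (6.59, "male", 1100, 88.452, 114),
]


def distance(a, b):
    """Euclidean distance over the four predicting attributes.

    Assumption: a nominal attribute contributes 0 if the values match, 1 otherwise.
    """
    total = 0.0
    for x, y in zip(a[:4], b[:4]):
        if isinstance(x, str):
            total += 0.0 if x == y else 1.0
        else:
            total += (x - y) ** 2
    return math.sqrt(total)


def knn_predict(query, train, k=4):
    """Average the FIQ of the k training instances closest to the query."""
    neighbors = sorted(train, key=lambda row: distance(query, row))[:k]
    return sum(row[4] for row in neighbors) / k


def rmse(actual, predicted):
    """Root mean-square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [row[4] for row in TEST]
predictions = [knn_predict(row, TRAIN) for row in TEST]
print("predictions:", predictions)
print("RMSE:", rmse(actual, predictions))
print("MAE:", mae(actual, predictions))
```

Because the attributes are not rescaled in this sketch, TOTVOL and WEIGHT dominate the distance; a different scaling or a different treatment of GENDER may select different neighbors.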
       
         (w1*m1) + (w2*m2) + (w3*m3) + (w4*m4) 
         _______________________________________
                   w1 + w2 + w3 + w4 
      
     
     Show your work in your report.
                                                         4-NN          
                                                         PREDICTION   
CCMIDSA	GENDER 	TOTVOL 	WEIGHT 		FIQ
6.48,	female,	1034,	62.143,		127		__________
6.59,	male,	1100,	88.452,		114		__________
                                                         4-NN        
                                                         ERROR      
root mean-square error (see p. 178)                      __________
mean absolute error (see p. 178)                         __________
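The weighted-average formula above can be sketched in Python as follows, where m1..m4 are the FIQ values of the 4 nearest neighbors and w1..w4 are their weights. The text above does not say how the weights are defined, so this sketch assumes inverse-distance weights (w = 1/d), which is one common choice. It also assumes a 0/1 distance for GENDER and unscaled numeric attributes. The names distance and weighted_knn_predict are hypothetical helpers introduced here.

```python
import math

# Training instances copied from the @data section: (CCMIDSA, GENDER, TOTVOL, WEIGHT, FIQ).
TRAIN = [
    (6.08, "female", 1005, 57.607, 96),
    (5.73, "female", 963, 58.968, 89),
    (7.99, "female", 1281, 63.958, 101),
    (8.42, "female", 1272, 61.69, 103),
    (6.84, "female", 1079, 107.503, 96),
    (6.43, "female", 1070, 83.009, 126),
    (7.6, "male", 1347, 97.524, 94),
    (6.03, "male", 1029, 81.648, 97),
    (7.52, "male", 1204, 79.38, 113),
    (7.67, "male", 1160, 72.576, 124),
]

# Predicting attributes of the two test instances.
QUERIES = [
    (6.48, "female", 1034, 62.143),
    (6.59, "male", 1100, 88.452),
]


def distance(a, b):
    """Euclidean distance over the four predicting attributes (0/1 for nominal ones)."""
    total = 0.0
    for x, y in zip(a[:4], b[:4]):
        if isinstance(x, str):
            total += 0.0 if x == y else 1.0
        else:
            total += (x - y) ** 2
    return math.sqrt(total)


def weighted_knn_predict(query, train, k=4, weight=lambda d: 1.0 / d):
    """Return (w1*m1 + ... + wk*mk) / (w1 + ... + wk) over the k nearest neighbors.

    Assumption: the default weight is 1/d; pass another function to change it.
    """
    ranked = sorted((distance(query, row), row[4]) for row in train)[:k]
    # An exact match would give an infinite weight, so return its target directly.
    for d, target in ranked:
        if d == 0.0:
            return float(target)
    weights = [weight(d) for d, _ in ranked]
    weighted_sum = sum(w * m for w, (_, m) in zip(weights, ranked))
    return weighted_sum / sum(weights)


for query in QUERIES:
    print(query, "->", weighted_knn_predict(query, TRAIN))
```

Passing weight=lambda d: 1.0 reduces the formula to the plain average of the 4 neighbors, which is a quick way to check the weighted result against the unweighted one.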
Show your work in your report.
CCMIDSA	GENDER 	TOTVOL 	WEIGHT 		FIQ
6.48,	female,	1034,	62.143,		?
6.59,	male,	1100,	88.452,		?
Show your work in your report.
                                                         4-NN          
                                                         PREDICTION   
CCMIDSA	GENDER 	TOTVOL 	WEIGHT 		FIQ
6.48,	female,	1034,	62.143,		127		__________
6.59,	male,	1100,	88.452,		114		__________ 
                                                         4-NN        
                                                         ERROR      
root mean-square error (see p. 178)                      __________
mean absolute error (see p. 178)                         __________
       
         (w1*m1) + (w2*m2) + (w3*m3) + (w4*m4) 
         _______________________________________
                   w1 + w2 + w3 + w4 
      
     
     Show your work in your report.
                                                         4-NN          
                                                         PREDICTION   
CCMIDSA	GENDER 	TOTVOL 	WEIGHT 		FIQ
6.48,	female,	1034,	62.143,		127		__________
6.59,	male,	1100,	88.452,		114		__________ 
                                                         4-NN        
                                                         ERROR      
root mean-square error (see p. 178)                      __________
mean absolute error (see p. 178)                         __________
Assume that the 2 randomly selected initial centroids are:
CCMIDSA	GENDER 	TOTVOL 	WEIGHT 		FIQ
6.48,	female,	1034,	62.143,		127
6.59,	male,	1100,	88.452,		114
Show the first 2 iterations of the Simple K-means clustering algorithm. That is:
Show your work in your report.
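Two iterations of K-means from the given initial centroids can be sketched in Python as follows. This is a sketch under stated assumptions, not a reference implementation of any particular tool: it clusters on all five attributes (including FIQ, since the centroids above carry FIQ values), uses Euclidean distance with a 0/1 distance for GENDER, and recomputes a centroid as the mean of each numeric attribute and the most frequent value of GENDER. The names distance, assign, and update are hypothetical helpers introduced here.

```python
import math
from collections import Counter

# All ten instances from the @data section: (CCMIDSA, GENDER, TOTVOL, WEIGHT, FIQ).
DATA = [
    (6.08, "female", 1005, 57.607, 96),
    (5.73, "female", 963, 58.968, 89),
    (7.99, "female", 1281, 63.958, 101),
    (8.42, "female", 1272, 61.69, 103),
    (6.84, "female", 1079, 107.503, 96),
    (6.43, "female", 1070, 83.009, 126),
    (7.6, "male", 1347, 97.524, 94),
    (6.03, "male", 1029, 81.648, 97),
    (7.52, "male", 1204, 79.38, 113),
    (7.67, "male", 1160, 72.576, 124),
]

# The two initial centroids given in the text.
INITIAL_CENTROIDS = [
    (6.48, "female", 1034, 62.143, 127),
    (6.59, "male", 1100, 88.452, 114),
]


def distance(a, b):
    """Euclidean distance over all attributes (0/1 distance for nominal ones)."""
    total = 0.0
    for x, y in zip(a, b):
        if isinstance(x, str):
            total += 0.0 if x == y else 1.0
        else:
            total += (x - y) ** 2
    return math.sqrt(total)


def assign(data, centroids):
    """Index of the nearest centroid for every instance."""
    return [
        min(range(len(centroids)), key=lambda c: distance(row, centroids[c]))
        for row in data
    ]


def update(data, labels, centroids):
    """Recompute each centroid: mean for numeric columns, mode for nominal ones."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [row for row, label in zip(data, labels) if label == c]
        if not members:  # keep the old centroid if a cluster is empty
            new_centroids.append(old)
            continue
        centroid = []
        for column in zip(*members):
            if isinstance(column[0], str):
                # Ties are broken by first occurrence in the cluster.
                centroid.append(Counter(column).most_common(1)[0][0])
            else:
                centroid.append(sum(column) / len(column))
        new_centroids.append(tuple(centroid))
    return new_centroids


centroids = INITIAL_CENTROIDS
history = []
for iteration in range(1, 3):
    labels = assign(DATA, centroids)
    centroids = update(DATA, labels, centroids)
    history.append((labels, centroids))
    print("iteration", iteration)
    print("  cluster of each instance:", labels)
    for c, centroid in enumerate(centroids):
        print("  centroid", c, ":", centroid)
```

Each pass first assigns every instance to its nearest centroid and then recomputes the centroids from those assignments, so the printed output lists both pieces of information per iteration.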
Assume that the 2 randomly selected initial centroids are:
CCMIDSA	GENDER 	TOTVOL 	WEIGHT 		FIQ
6.48,	female,	1034,	62.143,		127
6.59,	male,	1100,	88.452,		114
Show the first 2 iterations of the Simple K-means clustering algorithm. That is:
Show your work in your report.
Assume that we have followed the COBWEB/CLASSIT algorithm and have so far created a partial clustering containing just the first 4 instances. The following tree shows the current partial clustering, where the numbers in parentheses (1), (2), (3), (4) represent the 1st, 2nd, 3rd, and 4th data instances respectively, and "o" denotes an internal node.
                             o
                          /     \ 
                         o      (2)
                       /   \
                      o    (3)
                    /   \
                   (1) (4)
Your job is to describe all the alternatives that are considered by the
COBWEB/CLASSIT algorithm when adding instance (5) to this clustering tree.
For this final project, no individual part is included. That is, all the work on this project (except for the individual homework above) is to be done with your group partner.