CS4341 Introduction to Artificial Intelligence. B-2003
SOLUTIONS - Exam 2 - December 18, 2003

Prof. Carolina Ruiz
Department of Computer Science
Worcester Polytechnic Institute


Problem I. Decision Trees (25 points)

Consider the following dataset that specifies the type of contact lenses that is prescribed to a patient based on the patient's age, astigmatism, and tear production rate. The purpose of this problem is to use information theory, more precisely entropy (or disorder), to construct a decision tree that predicts the type of contact lenses that will be prescribed to a patient based on the patient's attributes.

ID   AGE             ASTIGMATISM   TEAR-PRODUCTION-RATE   CONTACT-LENSES
 1   young           no            normal                 soft
 2   young           yes           reduced                none
 3   young           yes           normal                 hard
 4   pre-presbyopic  no            reduced                none
 5   pre-presbyopic  no            normal                 soft
 6   pre-presbyopic  yes           normal                 hard
 7   pre-presbyopic  yes           normal                 none
 8   pre-presbyopic  yes           normal                 none
 9   presbyopic      no            reduced                none
10   presbyopic      no            normal                 none
11   presbyopic      yes           reduced                none
12   presbyopic      yes           normal                 hard

The partial decision tree below shows part of the decision tree constructed by using the entropy (or disorder) based algorithm.


                                  ASTIGMATISM
                               /               \
                         yes  /                 \ no
                             /                   \
                TEAR-PROD-RATE                  ???
                   /      \
          normal  /        \ reduced
                 /          \
               AGE          none
              / | \
       young /PP|  \ P
            /   |   \
         hard  none  hard

In this problem, you are asked to finish the construction of this decision tree following the entropy-based algorithm. In other words, you are asked to use entropy to select the attributes that will appear in the right subtree marked with a ??? above.
  1. (3 points) List the ID numbers of the data instances (i.e. rows in the dataset) that lie in the tree branch marked with a ??? above. Note that those data instances have ASTIGMATISM=no.
           
    • List of ID numbers: ___ 1, 4, 5, 9, 10 ______________________
  2. (15 points) Use the entropy formula to rank the attributes that can be used in the subtree marked with a ??? to predict the target/classification attribute (contact lenses). Show all the steps of the calculations. For your convenience, the base-2 logarithms of selected values are provided below.

    x        1       1/2     1/3     2/3     1/4     3/4     1/5
    log2(x)  0       -1      -1.6    -0.6    -2      -0.4    -2.3
    For the branch ASTIGMATISM=no we measure the weighted entropy of the two remaining
    predictive attributes, AGE and TEAR-PROD-RATE:
    
    AGE:
       contact lenses:          hard         none                soft
    young            (1/5) * [    0   -  0                -  1*log2(1)        ]  =  0
    pre-presbyopic   (2/5) * [    0   -  (1/2)*log2(1/2)  -  (1/2)*log2(1/2)  ]  =  0.4
    presbyopic       (2/5) * [    0   -  (2/2)*log2(2/2)  -  0                ]  =  0
                                                                                 --------
                                                                                  =  0.4
    
    TEAR-PROD-RATE:
       contact lenses:          hard         none                soft
    normal           (3/5) * [    0   -  (1/3)*log2(1/3)  -  (2/3)*log2(2/3)  ]  =  0.56
    reduced          (2/5) * [    0   -  (2/2)*log2(2/2)  -  0                ]  =  0
                                                                                 --------
                                                                                  =  0.56
    
    Since the attribute AGE has the smaller entropy (and therefore the larger information
    gain), we add it to the tree (a Python sketch that cross-checks these entropy
    calculations appears at the end of this problem):
    
                          ASTIGMATISM
                         /           \
                   yes  /             \ no
                       /               \
                TP-RATE                AGE
               /       \             /  |  \
       normal /         \ reduced   /   |   \
             /           \   young /   P|    \ PP
           AGE           none
          / | \
    young/PP|  \ P
        /   |   \
      hard none  hard
    
  3. (7 points) Remember to continue constructing the subtree until all the branches end in nodes with a homogeneous classification of the attribute contact lenses. That is, each branch ends in a node that contains dataset instances (rows) that have exactly the same value for contact lenses.
    For the branch (ASTIGMATISM=no and AGE=young) all examples have class soft, so for
    this branch the tree will have the leaf soft.

    For the branch (ASTIGMATISM=no and AGE=presbyopic) all examples have class none, so for
    this branch the tree will have the leaf none.

    The branch (ASTIGMATISM=no and AGE=pre-presbyopic) does not yield a single class, so we
    have to add the last available attribute, TP-RATE, to this branch. We obtain the final tree:
    	                            
                          ASTIGMATISM
                         /           \
                   yes  /             \ no
                       /               \
                TP-RATE                AGE
               /       \             /  |  \
       normal /         \ reduced   /   |   \
             /           \   young /   P|    \ PP
           AGE           none     /     |     \
          / | \                soft    none    TP-RATE
    young/PP|  \ P                            /      \
        /   |   \                    reduced /        \ normal
      hard none  hard                       /          \
                                         none          soft
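
The following Python sketch (not part of the original solution) cross-checks the
weighted-entropy calculations of part 2 on the ASTIGMATISM=no subset (instances
1, 4, 5, 9, 10). The function and variable names are illustrative.

    from math import log2

    # (AGE, TEAR-PROD-RATE, CONTACT-LENSES) for the ASTIGMATISM=no instances
    subset = [
        ("young",          "normal",  "soft"),   # 1
        ("pre-presbyopic", "reduced", "none"),   # 4
        ("pre-presbyopic", "normal",  "soft"),   # 5
        ("presbyopic",     "reduced", "none"),   # 9
        ("presbyopic",     "normal",  "none"),   # 10
    ]

    def weighted_entropy(rows, attr_index, class_index=2):
        """Average entropy of the class attribute after splitting on attr_index."""
        total = len(rows)
        result = 0.0
        for value in {r[attr_index] for r in rows}:
            branch = [r[class_index] for r in rows if r[attr_index] == value]
            probs = [branch.count(c) / len(branch) for c in set(branch)]
            entropy = -sum(p * log2(p) for p in probs)
            result += (len(branch) / total) * entropy
        return result

    print(weighted_entropy(subset, 0))  # AGE            -> 0.4
    print(weighted_entropy(subset, 1))  # TEAR-PROD-RATE -> approx. 0.55 (0.56 with the rounded logs above)

Since AGE gives the smaller weighted entropy, it is the attribute chosen for the
??? subtree, in agreement with the hand calculation.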
    
    

Problem II. Artificial Neural Networks + Genetic Algorithms (25 points)

Consider the following approach to training a neural net that uses genetic algorithms instead of the error backpropagation algorithm. Assume that we are training a 2-layer, feedforward neural net with 4 inputs, 1 hidden layer with 2 hidden nodes, and one output node. The output node produces an output that is a real number between 0 and 1. Also, assume that we have a collection of n training examples to train the net, each one with a desired output, which is also a real number between 0 and 1.

  1. Individuals in the population: Each individual in the population encodes a tuple of 13 real numbers, each one corresponding to the weight of a connection in the neural net. Note that the neural network contains 13 connections/links (8 input-to-hidden weights, 2 hidden-to-output weights, and 3 bias weights).

    An example of such an individual is:

    [w1=1.2, w2=-321.5, ..., w12=0.277, w13=-10.2]

  2. The fitness of an individual must represent how close the output computed by the corresponding net is to the desired output over all the training examples. The higher the fitness of an individual, the better the corresponding neural net computes the target function. The net corresponding to an individual is the neural net in which the 13 weights of the connections are as specified by the values in the individual.

  3. Describe in detail the genetic algorithm (or sequence of steps) that you would employ to search for the right values for the connections' weights of the neural net using this evolutionary approach.
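
One possible way to organize these steps is sketched below in Python (a minimal
sketch, not the official solution). It assumes the 13 weights are split as 8
input-to-hidden weights, 2 hidden-to-output weights, and 3 bias weights, that the
units are sigmoid units, and it makes illustrative GA choices: fitness-proportionate
selection, one-point crossover, Gaussian mutation, and fitness = 1/(1 + SSE) over
the n training examples.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def net_output(weights, x):
        """Forward pass of the 4-2-1 net for one input vector x (length 4)."""
        w_ih = weights[0:8].reshape(2, 4)   # input -> hidden weights
        b_h  = weights[8:10]                # hidden biases
        w_ho = weights[10:12]               # hidden -> output weights
        b_o  = weights[12]                  # output bias
        h = sigmoid(w_ih @ x + b_h)
        return sigmoid(w_ho @ h + b_o)      # real number in (0, 1)

    def fitness(weights, X, y):
        """Higher is better: net outputs closer to the desired outputs over all examples."""
        sse = sum((net_output(weights, x) - t) ** 2 for x, t in zip(X, y))
        return 1.0 / (1.0 + sse)

    def evolve(X, y, pop_size=50, generations=200, mutation_std=0.1):
        """X: list of length-4 input vectors; y: desired outputs in [0, 1]."""
        pop = rng.normal(0.0, 1.0, size=(pop_size, 13))          # random initial individuals
        for _ in range(generations):
            fits = np.array([fitness(ind, X, y) for ind in pop])
            probs = fits / fits.sum()                            # fitness-proportionate selection
            children = []
            while len(children) < pop_size:
                i, j = rng.choice(pop_size, size=2, p=probs)     # pick two parents
                cut = rng.integers(1, 13)                        # one-point crossover
                child = np.concatenate([pop[i][:cut], pop[j][cut:]])
                child += rng.normal(0.0, mutation_std, size=13)  # Gaussian mutation
                children.append(child)
            pop = np.array(children)
        fits = np.array([fitness(ind, X, y) for ind in pop])
        return pop[np.argmax(fits)]                              # best weight vector found

The returned 13-number vector is the fittest individual found, i.e. the weight
setting whose net best reproduces the desired outputs on the training examples.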

Problem III. Machine Vision (25 points)

Consider the problem of automatically understanding which objects are present in a 3-dimensional scene from (a subset of) its 2-dimensional digital images. A 2-dimensional digital image is just a matrix of pixels. For each pixel, the matrix contains the pixel's grayscale value.

For each of the three main stages of the process, describe the stage in terms of its input/output behavior. That is, specify what the inputs and what the outputs of each of the following stages are:

  1. (9 points) Segmentation.
    Brief Description (NOT REQUIRED)
    Segmentation is the grouping of the pixels in the matrix into
    meaningful units.  When segmenting a matrix of gray-scale values, the
    edges of these groupings usually reflect the possible boundaries of
    objects.  Two common methods (among many others) are:
    
    1. Threshold:
       Under the threshold method, the range of possible gray values is
       split into n equally sized slots, and each pixel is assigned to the
       slot into which its value falls.  After the entire matrix is
       processed, the groups are the sets of pixels in each slot.  A
       drawback of this method is that pixels completely unrelated in
       position can be grouped together purely on the basis of gray level.
    
    2. Assimilation:
       This is a similar strategy, but rather than belonging to a global
       group, each pixel is grouped only with the neighboring pixels whose
       values are within a certain threshold of its own.  In this manner,
       groups are still formed from pixels with similar values, but they
       remain spatially separated from similarly colored groups elsewhere
       in the image, leading to a better representation of the scene.
       (A small code sketch of both methods appears after item 3 below.)
    
    
  2. (9 points) Determination of shape.
    Brief Description (NOT REQUIRED)
    The shape of an object can be inferred from shading, texture, and motion. 
    
    shading:
       -shadows provide an understanding of the curvature of an object
       -sudden shading changes represent corners or boundaries
       -uniform shading in a lit scene corresponds to flat surfaces
    
    texture:
       -texture provides perspective cues about the object, conveying depth
       -a good example is the dimples on a golf ball
    
    motion:
       -by monitoring progress as motion occurs, certain points can be 
        traced and a shape model developed
       
      also:
       -as an object moves, the change in the shading on the surface provides 
        information about the shape as well
    
    
  3. (9 points) Object Identification/Recognition (for instance, determining that an unknown object in the image is, say, an obelisk).
    Brief Description (NOT REQUIRED)
    The identification of the unknown object is done by matching the object
    against templates in a database.  This matching process requires that
    one identify corresponding points in the unknown object and in the
    templates.  When only rotations around the y-axis are allowed, the
    number of templates needed is two and the number of corresponding
    points needed is two.  Notice that for each point we do not need a z
    value, since it is undetectable in 2-D.  We also do not need the y
    coordinate, since it always stays the same.  So, we only need to derive
    a function that predicts the x-coordinate of the unknown view as an
    interpolation between the two templates we are given.  The function is
    of the form x_need = Alpha * x1 + Beta * x2.  We need the two
    corresponding points so that we can solve a system of two equations in
    two unknowns.  The matching is done by finding the Alpha and Beta
    values from two points and then using those values of Alpha and Beta to
    predict the x coordinates of the other control points in the unknown
    object based on the x coordinates of the control points in the
    templates.  Given a small threshold for error (or mismatch), this
    procedure will predict which template in the database the object fits,
    if any at all.
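
A small numpy sketch (hypothetical data and names, not part of the exam solution)
of the matching procedure in item 3: solve the 2x2 system x_need = Alpha*x1 + Beta*x2
from two corresponding points, then check how well that Alpha and Beta predict the
x-coordinates of the remaining control points.

    import numpy as np

    def fit_alpha_beta(x1_pair, x2_pair, x_unknown_pair):
        """Solve x_unknown = Alpha*x1 + Beta*x2 from two corresponding points."""
        A = np.column_stack([x1_pair, x2_pair])      # 2x2 system of equations
        return np.linalg.solve(A, x_unknown_pair)    # (Alpha, Beta)

    def matches(x1, x2, x_unknown, threshold=2.0):
        """x1, x2, x_unknown: x-coordinates of corresponding control points in the
        two template views and in the unknown object.  The first two points are
        used to solve for Alpha and Beta; the rest are used to test the match."""
        alpha, beta = fit_alpha_beta(x1[:2], x2[:2], x_unknown[:2])
        predicted = alpha * x1[2:] + beta * x2[2:]
        return np.all(np.abs(predicted - x_unknown[2:]) <= threshold)

The unknown object is reported as an instance of whichever object's template pair
(if any) passes this test within the chosen error threshold.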
    
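A minimal Python sketch (hypothetical function names and parameters) of the two
segmentation ideas in item 1: global thresholding into n equally sized gray-level
slots, and an assimilation-style pass that grows a region only through spatially
adjacent pixels of similar gray value.

    import numpy as np

    def threshold_segment(image, n_slots=4):
        """Assign each pixel (gray values 0-255) to one of n equally sized slots."""
        image = np.asarray(image, dtype=float)
        slots = np.floor(image / 256.0 * n_slots).astype(int)
        return np.clip(slots, 0, n_slots - 1)         # label matrix, values 0..n_slots-1

    def region_grow(image, seed, tol=10):
        """Grow a region from `seed`, adding 4-neighbors whose gray value is within `tol`."""
        image = np.asarray(image, dtype=float)
        h, w = image.shape
        in_region = np.zeros((h, w), dtype=bool)
        stack = [seed]
        in_region[seed] = True
        while stack:
            r, c = stack.pop()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < h and 0 <= nc < w and not in_region[nr, nc]
                        and abs(image[nr, nc] - image[r, c]) <= tol):
                    in_region[nr, nc] = True
                    stack.append((nr, nc))
        return in_region                              # boolean mask of one segment

The first function can group spatially unrelated pixels (the drawback noted above);
the second only merges pixels that are both similar in value and adjacent.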

Problem IV. Natural Language Processing (25 points)

For each of the following stages of the natural language understanding process, describe the inputs and the outputs of the stage. The more complete your lists of inputs and outputs, the higher your grade in this problem.
  1. (7 points) Speech Recognition
    Brief Description (NOT REQUIRED)
    During this stage, the analog signal is analyzed and split into
    different frequencies (or harmonics) using, for instance, the Fourier
    transform.  These frequencies are used to identify basic natural
    language sounds, called phonemes, that appear in the signal.
    Identifying these phonemes may be done using a trained neural network
    or by matching the sound (a piece of the input signal) against a
    library of known phonemes.  After the phonemes are identified,
    complete words are constructed from them.
    
    

    Some difficulties present during this stage (to mention just a few) are:

    • There is no simple mapping between sounds and words. Statistical information on words and phonemes is needed to help disambiguate.
    • There is a great variance in pronunciation from person to person due to gender, dialect spoken by the person, accent, etc.
    • Even the same person may pronounce words differently depending on the context.
    • The same sound may correspond to different words.
    • It is hard to split the continuous signal into words because people often phonetically connect contiguous words.
    • Background noise may interfere.
  2. (7 points) Syntactic Analysis
    Brief Description (NOT REQUIRED)
    During this stage, the grammar of the natural language is used to
    determine the structure of the sentence.  A grammar is a collection of
    rules of syntax of the language.  If the sentence is syntactically
    ambiguous, this stage may output more than one possible syntactic
    structure for the sentence.  A classical example of such a sentence is
    "Fruit flies like a banana", in which the subject can either be "Fruit"
    or "Fruit flies".
    
    Given a sentence and a grammar, a parsing procedure constructs
    one or more possible parse trees, each one representing a
    possible structure for the sentence (a small parsing sketch
    appears at the end of this problem).
    
    Possible complications are:
    
    • Syntactically ambiguous sentences
    • Having to parse syntactically incorrect sentences
    • Natural languages are rich in declensions, adjectives, adverbs, etc., which make the analysis difficult.
  3. (7 points) Semantic Analysis
    Brief Description (NOT REQUIRED)
    The essence of this stage is to translate the sentence into the
    internal knowledge representation of the computer program, so that the
    program can reason and use the information/knowledge encoded in the
    sentence.
    
    During this stage, a partial translation is made based on the parse
    tree(s) obtained in the previous syntactic analysis stage.
    
    As before, ambiguity is a possible complication for this stage. That
    is, sentences that have a unique syntactic structure but more than one
    possible meaning.
    
    
  4. (7 points) Pragmatic Analysis
    Brief Description (NOT REQUIRED)
    The purpose of this stage is to complete, as much as possible, the
    translation of the original sentence into the internal representation
    of the computer program. For this, the context of the sentence is used
    to disambiguate the meaning of the sentence as much as possible and to
    resolve references (e.g. "Mary bought bananas. She said they are
    inexpensive". In the second sentence, She="Mary" and they="bananas").
    
    Possible complications in this stage are difficulties in understanding
    the intentions of the speaker's message, metaphors, and semantic
    ambiguity.
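
As an illustration of the syntactic analysis stage (item 2), the following Python
sketch (hypothetical toy grammar, not part of the exam solution) uses a CKY chart
parser to count the parses of the ambiguous example sentence "Fruit flies like a
banana"; it finds the two readings mentioned above.

    from collections import defaultdict

    # Toy grammar in Chomsky normal form: parent -> (left child, right child)
    binary_rules = [
        ("S",  ("NP", "VP")),
        ("NP", ("N",  "N")),     # "fruit flies" as a compound subject
        ("NP", ("Det", "N")),    # "a banana"
        ("VP", ("V",  "NP")),    # "like a banana"  (like as the verb)
        ("VP", ("V",  "PP")),    # "flies like a banana"  (flies as the verb)
        ("PP", ("P",  "NP")),
    ]

    # Lexical rules: category -> word
    lexical_rules = [
        ("N", "fruit"), ("N", "flies"), ("N", "banana"),
        ("NP", "fruit"),             # bare-noun subject for the second reading
        ("V", "flies"), ("V", "like"),
        ("P", "like"),
        ("Det", "a"),
    ]

    def cky_parse_counts(words):
        n = len(words)
        # chart[i][j][X] = number of ways nonterminal X derives words[i:j]
        chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            for lhs, word in lexical_rules:
                if word == w:
                    chart[i][i + 1][lhs] += 1
        for span in range(2, n + 1):
            for i in range(0, n - span + 1):
                j = i + span
                for k in range(i + 1, j):
                    for lhs, (left, right) in binary_rules:
                        if chart[i][k][left] and chart[k][j][right]:
                            chart[i][j][lhs] += chart[i][k][left] * chart[k][j][right]
        return chart[0][n]

    print(cky_parse_counts("fruit flies like a banana".split())["S"])   # prints 2

Each of the two parses corresponds to one of the syntactic readings: in one,
"fruit flies" is the subject noun phrase; in the other, "fruit" alone is the
subject and "flies" is the verb.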