WORCESTER POLYTECHNIC INSTITUTE
    Computer Science Department

    CS4341 - Artificial Intelligence

    Version: Thu Mar 21 14:11:45 EDT 2013

    GA Diversity Selection Process

    From Winston, AI, Chapter 25

    The intention of this process is to ensure that the individuals in a new population are not clustered in one region of the search space, since crossover and mutation alone may not be able to move the population out of such a cluster. Selecting individuals for the next population partly on diversity (i.e., low similarity to the individuals already selected) helps to prevent this.

    1. Do all the normal steps involving fitness, crossover and mutation,
      but keep everything in the same population.
    2. Evaluate all the individuals in the population for their "quality" (i.e., fitness).
    3. Rank the individuals by their quality (i.e., 1st, 2nd, 3rd, ...).
    4. That position (e.g., 1) is each individual's Quality Rank value.
    5. Select the highest ranked individual for the next population.

    6. Next calculate the Diversity Rank value for each of the remaining individuals
      compared to the individual(s) already selected.
    7. Combine (add) the Diversity Rank and Quality Rank values
      to give a Combined Rank value for each remaining individual.
    8. Select the individual with the best (i.e., lowest) Combined Rank for the next population.
    9. Repeat the Diversity Rank and Combination calculations (from step 6)
      until "enough" individuals have been selected for the population.

    10. Discard any individuals not selected.
    11. Those selected form the new population.
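
    The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the handout's reference implementation: individuals are taken to be equal-length strings, distance is taken to be Hamming distance, a distance of zero is scored as infinitely close, and ties in Combined Rank are broken in favor of the better Quality Rank. The names hamming and select_diverse are hypothetical.

```python
import math

def hamming(a, b):
    """Assumed distance measure: number of differing positions."""
    return sum(x != y for x, y in zip(a, b))

def select_diverse(population, fitness, n_select, distance=hamming):
    """Pick n_select individuals by combined Quality + Diversity Rank."""
    if not population or n_select <= 0:
        return []
    # Steps 2-4: rank by quality; position 1 is the fittest individual.
    order = sorted(range(len(population)),
                   key=lambda i: fitness(population[i]), reverse=True)
    quality_rank = {i: r for r, i in enumerate(order, start=1)}
    # Step 5: the highest ranked individual is always selected first.
    selected = [order[0]]
    remaining = order[1:]
    while remaining and len(selected) < n_select:
        # Step 6: sum of inverse squared distances to those already selected.
        def closeness(i):
            total = 0.0
            for j in selected:
                d = distance(population[i], population[j])
                if d == 0:
                    return math.inf  # assumption: a duplicate is least diverse
                total += 1.0 / (d * d)
            return total
        scores = {i: closeness(i) for i in remaining}
        by_diversity = sorted(remaining, key=lambda i: scores[i])
        diversity_rank = {i: r for r, i in enumerate(by_diversity, start=1)}
        # Steps 7-8: add the two ranks and take the lowest total.
        best = min(remaining,
                   key=lambda i: (quality_rank[i] + diversity_rank[i],
                                  quality_rank[i]))
        selected.append(best)
        remaining.remove(best)
    # Steps 10-11: individuals not selected are discarded.
    return [population[i] for i in selected]

population = ["11110000", "11100000", "00000111", "00000001"]
count_ones = lambda s: s.count("1")  # hypothetical fitness function
print(select_diverse(population, count_ones, 2))
# prints ['11110000', '00000111']
```

    In this example the second pick is "00000111" rather than the second-fittest "11100000", because the latter is only one bit away from the individual already selected.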

    Diversity Rank Calculation:

    • To calculate the "Diversity Rank" of an individual: Sum the inverse squares (i.e., 1/d_i^2) of the "distance" d_i between that individual and each of the individuals that have been selected already. This gives a score for which smaller is better: a small sum means the individual is far from those already selected. Rank these sums across the remaining individuals (smallest sum ranked 1st) to get the Diversity Rank.
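
    As a small illustration of this calculation, the sketch below computes the inverse-square sums and turns them into ranks. It assumes that individuals are distinct equal-length strings, that distance is Hamming distance, and that a distance of zero counts as infinitely close; the function names are hypothetical.

```python
import math

def hamming(a, b):
    """Assumed distance measure: number of differing positions."""
    return sum(x != y for x, y in zip(a, b))

def closeness_score(candidate, selected, distance=hamming):
    """Sum of 1/d^2 over all already-selected individuals."""
    total = 0.0
    for s in selected:
        d = distance(candidate, s)
        if d == 0:
            return math.inf  # assumption: identical to a selected individual
        total += 1.0 / (d * d)
    return total

def diversity_ranks(candidates, selected, distance=hamming):
    """Map each candidate to its rank; rank 1 has the smallest sum."""
    scores = {c: closeness_score(c, selected, distance) for c in candidates}
    ordered = sorted(candidates, key=lambda c: scores[c])
    return {c: rank for rank, c in enumerate(ordered, start=1)}

ranks = diversity_ranks(["0001", "0011", "1111"], selected=["0000"])
print(ranks)
# prints {'1111': 1, '0011': 2, '0001': 3}
```

    The candidate "1111" is furthest from "0000" (distance 4, score 1/16), so it receives Diversity Rank 1.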

    Distance Calculation:

    • To calculate the distance between two individuals, make some appropriate calculation based on the difference between the strings that represent them. If the strings are very different, the distance should be large.
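
    One possible choice, shown here only as an example, is Hamming distance: the number of positions at which two equal-length strings differ. The handout does not prescribe a particular measure, so both the choice and the function name hamming_distance are assumptions; other encodings may call for a different calculation.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))

print(hamming_distance("10110", "10011"))  # prints 2
print(hamming_distance("10110", "10110"))  # prints 0
```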