WORCESTER POLYTECHNIC INSTITUTE
Computer Science Department

CS4341 ❏ Artificial Intelligence

Version: Mon Oct 2 14:37:56 EDT 2006

Course Contents

1 The Intelligent Computer

  • Course Information
    -- web, book, intro, projects, exams

  • The Field and the Book
    • Definition:
      -- AI is the study of ...
      • computations that make it possible to perceive, reason, and act.
      • how to make computers do things which, at the moment, people do better.
      • the design of intelligent agents.
      • how to make computers act like those in the movies!
      -- Turing Test
        -- avoid definition of intelligence
          how would you define it?
        -- system intelligent if passes test
        -- person or machine ? (Eliza)
    • Engineering goal -- solve real-world problems
    • Scientific goal -- explain various sorts of intelligence
    • How AI has changed
      -- focus on systems that act rationally
    • The Near-Term Applications
      -- e.g., routine design
      -- e.g., detect credit card fraud
    • The Long-Term Applications
      -- what is still left to do...????
      -- chess? Deep Blue
      -- space? Remote Agent and Deep Space 1
        "Remote Agent (RA) is a model-based, reusable, artificial intelligence (AI) software system that enables goal-based spacecraft commanding and robust fault recovery. RA was flight validated during an experiment onboard Deep Space 1 (DS1) between May 17 and May 21, 1999."
      -- autonomous vehicles? DARPA Grand Challenge ( Briefing Slides)
      -- Video: Winning the DARPA Grand Challenge
    • AI Sheds New Light on Traditional Questions
      -- computers provide new concepts & language
      -- computers require precision (e.g., what is "creativity"?)
      -- explore impact of technique or knowledge (add/remove)
      -- theories > computational models > implementations > results > refinements
      -- use of computers allows testing
      -- well tested methods used as tools
    • AI Helps Us to Become More Intelligent
      -- suggests new/better ways to tackle problems

  • What Intelligent Systems Can Do
    -- diagnosis, design, planning, scheduling, navigation, vision, tutoring, learning, ...
    • Help Experts to Solve Difficult Analysis Problems
    • Help Experts to Design New Devices
      -- Ulrich's function sharing problem (e.g., lamp chain/cord)
    • Learn from Examples
      -- rules from sample data, e.g., ID3
      -- data mining (KDDRG)
    • Provide Answers to English Questions
      -- Natural Language Understanding and Generation
    • AI Is Becoming Less Conspicuous, yet More Essential
      -- Airport gate allocation
      -- many embedded applications (cars, washing machines, ...)

  • Criteria for Success
    -- clear definition of task and implementable procedure for it
    -- regularities or constraints available
    -- other knowledge
    -- solves real problem
    -- provides new theory/method
    -- suggests new opportunities

2 Semantic Nets and Description Matching

  • Representations
    • Good Representations Are the Key to Good Problem Solving
      -- representation: a set of conventions about how to describe
      -- description: made using a representation
      -- Fig.2.1
      -- Farmer, Fox, Goose, Grain: node & link representation
      -- could show all states and all transitions
      -- safe states only -- constrained state space, reduces problem
      -- for searches often have Start and Goal states
      -- An aside, wrt searching:
      • what's a "state"?
      • what's an "operator"?
      • think of different tasks in different domains
      -- picking appropriate repr is key
      -- rich problems require rich descriptions
    • Good Representations Support Explicit, Constraint-Exposing Description
      • Make important objects & relations explicit & visible
      • Expose natural constraints
      • Suppress irrelevant details
      • Makes things understandable, complete, concise
      • Are fast to use
      • Can create with procedure
    • A Representation Has Four Fundamental Parts
      -- lexical (vocab.), structural (syntax), semantic (meaning), procedural (use)
    • Semantic Nets Convey Meaning
      -- nodes (denoting objects), links (denoting relationships), labels (application specific).
      -- examples: state space, game tree, decision tree, ...

  • The Describe-and-Match Method
    • describe-and-match method
      -- Fig.2.4
      -- an example of a Problem-Solving Method (PSM)
      -- specifies knowledge needed, I/O, and pattern of reasoning
      -- the method:
      • describe object using repr.
      • match description against "library"
      • if no match, failure
      • if satisfactory match, then announce.
      -- basis of Case-based Reasoning (CBR)
    • Issues
      -- how to describe?
      -- how to match?
      -- what is "satisfactory"?
      -- e.g., CBR: partial match with adaptation
    • Feature-Based Object Identification
      -- Fig.2.5
      -- example of describe-and-match
      -- describe using features
        e.g., height, width, color, # of holes, ...
      -- represent object as point in multidimensional feature space
      -- e.g., capital letters (# of lines, # of curves) (D vs. M)
      -- how to identify object?

  • The Describe-and-Match Method and Analogy Problems
    • Analogy problems: A is to B, as C is to x?
      -- describe rule of how A is to B
      -- find (C rule x)
    • Geometric Analogy Rules
      -- Fig.2.10
      -- A Rule: Describes Object Relations and Object Transformations
      -- (object relations in A) + (object relations in B) + (A to B transformation)
      -- Note: use labelled links of different types (relation, transform)
    • Scoring Mechanisms Rank Answers
      -- Fig.2.12
      -- Match: how to measure similarity of two rules?
      -- general problem of matching two representations
      -- Fig.2.13
      -- can weight relations differently (more or less essential to match)
      -- weights are often a problem (lack of knowledge)
    • Ambiguity Complicates Matching
      -- multiple ways to describe how "A is to B"
      -- ambiguity is often a problem

  • The Describe-and-Match Method and Recognition of Abstractions
    • Abstraction over representation
      -- Fig.2.22
      -- Fig.2.23
      -- abstracting sub-nets into nodes
      -- keep relationships between nodes
      -- enable more general results
      -- make patterns explicit, give new insight.
      -- levels of representation (detail)
      • helps with match?
      -- can also have a point of view (POV) in the repr (electrical, heat)

  • Problem Solving and Understanding Knowledge
    -- ask questions about knowledge
    • What kind of knowledge is involved?
      -- objects? processes?
      -- "An ontology is a specification of a conceptualization."
    • How should it be represented?
      -- logic? semantic nets? rules? frames? ...
    • How much knowledge required?
    • What exactly is the knowledge needed?
      -- e.g., feature space, library of known solutions, ...
    • Is it available? and from where?
    • transparent? (understandable)
    • complete? (can say all of what's needed)

3 Generate and Test, Means-Ends Analysis, and Problem Reduction

  • Project 1 intro (Diagram)

  • PSM control is important
    -- select best knowledge at best time
    -- allocate resources (cheap methods first)
    -- aim towards answer (goal)
    -- reduce space searched
    • "informed" control (avoid useless areas of space)
    • use constraints (e.g., not fox + geese)
    • use natural constraints (e.g., symmetry, redundancy)
    • use suggestions (e.g., rules of thumb)(i.e., heuristics)
    -- meta-knowledge (how useful is a piece of knowledge)
    -- methods for control
    • explicit control by procedures or PSMs
    • message passing
    • matching used for control (e.g., rules)

  • The Generate-and-Test Method
    -- must be able to generate candidate solutions
    -- must know how to test them
    -- test must be easy
    -- good if 'many' solutions in search space
    -- good if small search space
    -- goal known? goal recognized? satisficing?
    -- no inherent direction

    • Generate-and-Test Systems Often Do Identification
      -- Test conditions identify solution
      -- G&T for Design problems? for Diagnosis problems?

    • Good Generators
      -- Complete -- eventually cover all possibilities
      -- Nonredundant -- don't waste time
      -- Informed -- propose only sensible possibilities.
      -- sometimes preceded by a Plan step (constrain G)
      -- could use feedback to affect G
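    -- a minimal generate-and-test sketch (Python; the generator, test, and usage
       below are hypothetical illustrations, not from the course materials):

         def generate_and_test(candidates, test):
             # Generator: should be complete, nonredundant, and informed.
             for candidate in candidates:
                 if test(candidate):           # Test: must be cheap to apply
                     return candidate          # satisfactory -- announce it
             return None                       # generator exhausted -- failure

         # hypothetical use: find a number whose square ends in 21
         print(generate_and_test(range(1, 100), lambda n: n * n % 100 == 21))   # -> 11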

  • The Means-Ends Analysis Method
    • Key Idea Is to Reduce Differences
      -- difference between current and goal state
      -- use difference description to select procedure/operator
      -- difference table -- links differences to operators
      -- what to do if choice of operators?
      -- direction: work from current (forward)?
      -- direction: work from goal (backwards)? *
      -- these produce easier problems
      -- direction: work in the middle??
      -- produces easier problems
      -- recursive

  • The Problem-Reduction Method
    • Moving Blocks Illustrates Problem Reduction
      -- convert goals into subgoals
      -- forms goal tree
      -- what's a "goal"?
      -- Fig.3.6
      -- PUT-ON uses GET-SPACE & GRASP & MOVE & UNGRASP
      -- Fig.3.8
      -- subgoals related by AND
      -- subgoals are ordered
      -- PUT-ON may indirectly use PUT-ON -- Recursive
      -- in general may be AND-OR tree (choice of decomp)
      -- subgoal dependencies?

    • Example from Design problem (diagrams)
      -- hierarchical heuristic top-down problem decomposition

    • Goal Trees Enable Introspective Question Answering
      -- How questions? look down
      -- Why questions? look up
      -- provides explanations --why good?

    • Problem-Solving Methods Often Work Together
      -- different subproblems might need different PSMs
      -- key issue is to identify and characterize PSMs

4 Nets and Basic Search

  • Blind Methods
    -- Fig.4.1
    -- can search for goal or path
    -- can search using path use costs (e.g., distances) or not
    -- for any solution or best solution
    -- with knowledge to guide, or not
    -- if not, known as uninformed, weak, or blind

    • Net Search Is Really Tree Search
      -- Fig.4.2
      -- root to leaf
      -- tree not known initially, gets generated
      -- node "expansion" makes a tree
      -- unexpanded nodes are "open" (else "closed")
      -- node has branching factor b

    • Search Trees Explode Exponentially
      -- tree has depth d
      -- number of paths is b^d
      -- number "explodes exponentially" with tree depth.
      • size of space = 1 + b + b^2 + b^3 + ... + b^d
      • for d=13, if b=2, size= 16,383
      • for d=13, if b=2.5, size= 248,352
      • for d=13, if b=3, size= 2,391,484!
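      -- a quick check of these numbers (Python; a small illustrative snippet, not
         from the course materials):

           def tree_size(b, d):
               # 1 + b + b^2 + ... + b^d  (geometric series)
               return sum(b ** i for i in range(d + 1))

           for b in (2, 2.5, 3):
               print(b, round(tree_size(b, 13)))    # 16383, 248352, 2391484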

    • The Menu of Search strategies
      • Depth 1st -- keep to same path going deeper (blind)
      • Breadth 1st -- all nodes at current level, then next level (blind)

      • Hill Climbing -- like depth 1st but explore most gain 1st
      • Beam -- like breadth 1st but prune unpromising children
      • Best 1st -- expand best open node 1st, regardless of depth
      • Branch-and-bound -- expand least-cost-so-far node, bound at goal
      • A* -- like B-and-b but with heuristic info.

    • Depth-First Search Dives into the Search Tree
      -- keep heading down same path
      -- commitment
      -- optimistic
      -- if no goal reached then "backup"
      -- backup to last choice point
      -- what space is needed?
      -- can be "depth limited"
      -- can use iterative deepening

    • Breadth-First Search Pushes Uniformly into the Search Tree
      -- check all paths of same length, then all of next length, etc.
      -- expanding wavefront
      -- what space is needed?

    • The Right Search Depends on the Tree
      -- depth-first bad for long failing paths
      -- depth-first sensitive to where goal node(s) is/are
      -- depth-first's first goal found may not be shortest path
      -- breadth-first uses a lot of space
      -- breadth-first is sensitive to branching factor
      -- breadth-first's first goal found will be shortest path
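    -- a compact sketch of both blind searches (Python; the tree-search routine, the
       toy net, and all names below are hypothetical illustrations, not course code):

         from collections import deque

         def tree_search(start, goal, successors, pop):
             # Blind search over partial paths; `pop` picks which open path to expand next.
             frontier = deque([[start]])               # open partial paths, root to tip
             while frontier:
                 path = pop(frontier)
                 node = path[-1]
                 if node == goal:
                     return path
                 for child in successors(node):        # node "expansion"
                     if child not in path:             # avoid looping back along this path
                         frontier.append(path + [child])
             return None                               # no goal found

         def depth_first(s, g, succ):   return tree_search(s, g, succ, deque.pop)      # LIFO
         def breadth_first(s, g, succ): return tree_search(s, g, succ, deque.popleft)  # FIFO

         # hypothetical net (node -> children)
         net = {'S': ['A', 'B'], 'A': ['C', 'D'], 'B': ['G'], 'C': [], 'D': [], 'G': []}
         print(breadth_first('S', 'G', lambda n: net[n]))   # ['S', 'B', 'G'] (shortest path)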

  • Heuristically Informed Methods
    -- more knowledge tends to lead to less searching
    -- what's a heuristic?
    -- e.g. distance as crow flies instead of road distance.
    -- used to determine quality of node
    -- used to determine estimate of path to goal
    -- usually only heuristic is available
    -- heuristic should direct search, but not prune (why?)
    -- consider the eight puzzle (example heuristic)

    • Quality Measurements Turn Depth-First Search into Hill Climbing
      -- quality is height of hill
      -- quality surface
      -- e.g., estimate of how close to goal
      -- climb for quality
      -- climb in direction that is most beneficial
      -- stay on same path (depth-first search)

    • Foothills, Plateaus, and Ridges Make Hills Hard to Climb
      -- Fig.4.7
      -- foothills - local maximum vs. global maximum
      -- plateaus - no slope to climb (aimless wandering)
      -- ridges - steps don't correspond to climbing slope
      -- fixes? backtrack, jump (some nondeterminism)

    • Beam Search Expands Several Partial Paths and Purges the Rest
      -- based on breadth-first
      -- keep only the best w nodes at each level

    • Best-First Search Expands the Best Partial Path
      -- use best of all open nodes at any level
      -- uses estimated quality of current node

    • Search May Lead to Discovery
      -- to designs
      -- quality heuristic is "interestingness"
      -- operators add or change design elements
      -- goal test needed?

5 Nets and Optimal Search

  • The Best Path
    -- search a network
    -- for goal or for path
    -- use quality measure
    -- or path cost measure
    -- usually heuristic

    • Branch-and-Bound Search Expands the Least-Cost Partial Path
      -- Fig.5.2
      -- have cost for every action taken at a node
        e.g., distance travelled
      -- expands the least-cost partial path
        i.e., best first by partial path cost
      -- bound: stops some nodes from being expanded (prune) (heuristic?)
      -- bounded if partial path cost >= goal path cost
        i.e., find goal, but keep looking

    • Adding Underestimates Improves Efficiency
      -- cost of total path through a node?
      -- e(total path length) = d(already travelled) + e(distance remaining)
      -- "e" is estimate (heuristic)
      -- "d" is distance (known)
      -- make e an underestimate
      • i.e., the real distance can't be less
      • overestimates may reach goal but not by best path
      -- expand node with lowest underestimated path
      • it predicts that the path through this node is best
      -- make e as accurate as possible
      -- if completely accurate?

  • Redundant Paths
    -- discard them
    -- keep only best S->i and best i->G
    -- called dynamic-programming principle

    • Underestimates and Dynamic Programming Improve Branch-and-Bound Search
      -- called A*
      -- very common approach
      -- u(S,G) = d(S,j) + u(j,G)
      -- u(j,G) is heuristic underestimate of cost of remaining path.
      -- can use iterative-deepening A* (i.e., IDA*)
        depth-first search with an increasing bound on e(total path length)
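      -- a minimal A* sketch (Python; the weighted net, estimates, and names below are
         hypothetical illustrations, not the book's code):

           import heapq

           def a_star(start, goal, successors, h):
               # Expand the open path with least d(already travelled) + h(estimate remaining).
               # successors(n) yields (neighbor, step_cost); h must never overestimate.
               frontier = [(h(start), 0, start, [start])]      # (d + h, d, node, path)
               best_d = {}                                     # dynamic-programming principle:
               while frontier:                                 #   keep only best S->i found so far
                   f, d, node, path = heapq.heappop(frontier)
                   if node == goal:
                       return path, d
                   if d > best_d.get(node, float('inf')):
                       continue                                # a cheaper path to node is known
                   for child, step in successors(node):
                       nd = d + step
                       if nd < best_d.get(child, float('inf')):
                           best_d[child] = nd
                           heapq.heappush(frontier, (nd + h(child), nd, child, path + [child]))
               return None, float('inf')

           # hypothetical weighted net and straight-line estimates
           net = {'S': [('A', 2), ('B', 5)], 'A': [('G', 4)], 'B': [('G', 2)], 'G': []}
           est = {'S': 4, 'A': 3, 'B': 1, 'G': 0}
           print(a_star('S', 'G', lambda n: net[n], lambda n: est[n]))   # (['S', 'A', 'G'], 6)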

    • Robot Path Planning Illustrates Search
      -- Fig.5.6
      -- robot path planning example
      -- Fig.5.7
      -- configuration space obstacles
      -- Fig.5.9
      -- make visibility graph of sight lines
      -- do A* search over graph

6 Trees and Adversarial Search

  • Algorithmic Methods
    -- Adversarial: more than 1 person trying to win (e.g., games)
    -- why study?
    • strategy
    • uncertainty
    • v. large state space
    • fixed rules (operators/actions)
    • small state descriptions

    • Nodes Represent Board Positions
      -- Fig.6.1
      -- game tree = possible future board configurations
      -- node = board config
      -- link = a possible move from one player
      -- ply = levels in tree including root level
      -- each level represents possible situations for one player

    • Exhaustive Search Is Impossible
      -- 10^120 possible branches in a chess game tree
      -- use a "lookahead" procedure with situation evaluation
      -- what do we need?
      • Generator: what are all possible legal moves from position
      • (Heuristic filter: all "plausible" moves)
      • Evaluator: how good a move is for current player
      • Pruning: remove losing moves (most games don't allow backup)

    • The Minimax Procedure Is a Lookahead Procedure
      -- Fig.6.2
      -- static evaluator = heuristic evaluation of board position quality
      • # of pieces
      • strength of pieces (queen > pawn)
      • mobility (poss. moves)
      • control (squares threatened)
      • threats (potential captures)
      • patterns of pieces (e.g., diagonal pawns)
      -- score - very +ve means player A wins, very -ve means player B wins
      -- maximizer - wants +ve scores (+10)
      -- minimizer - wants -ve scores (-10)
      -- assume each player will always pick move that is best for them
      -- goes to bottom of tree, evaluates
      -- back the scores up tree, "minimaxing" (minimize/maximize)
      -- pick move that avoids opponents best move(s)
      -- how far to expand tree?
      -- minimax is expensive (large trees)

    • The Alpha-Beta Procedure Prunes Game Trees
      -- Fig.6.3
      -- don't expand a node that can't provide a score better than one you already have
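      -- a minimax-with-alpha-beta sketch (Python; `children` and `evaluate` stand in
         for a real move generator and static evaluator -- an illustration only):

           def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
               moves = children(node)
               if depth == 0 or not moves:           # depth limit or leaf: static evaluation
                   return evaluate(node)
               if maximizing:
                   best = float('-inf')
                   for child in moves:
                       best = max(best, alphabeta(child, depth - 1, alpha, beta, False,
                                                  children, evaluate))
                       alpha = max(alpha, best)
                       if beta <= alpha:             # minimizer already has a better option
                           break                     # prune the remaining children
                   return best
               else:
                   best = float('inf')
                   for child in moves:
                       best = min(best, alphabeta(child, depth - 1, alpha, beta, True,
                                                  children, evaluate))
                       beta = min(beta, best)
                       if beta <= alpha:
                           break
                   return best

      -- called at the root with alpha = -inf and beta = +inf, this returns the same
         backed-up score as plain minimax while visiting fewer nodes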

    • Alpha-Beta May Not Prune Many Branches from the Tree
      -- game tree branch order makes a difference
      -- Fig.6.6
      -- still exponential with depth
      -- time/space saved can allow deeper searches

  • Heuristic Methods
    -- may prune the path that leads to a win!
    -- i.e., into the valley to reach the hill

    • Progressive Deepening Keeps Computing Within Time Bounds
      -- depth limited search, with d = 1, 2, 3, ...
      -- Anytime algorithm

    • Heuristic Continuation Fights the Horizon Effect
      -- Fig.6.7
      -- fixed depth search produces a "horizon" (may be bad beyond it!)
      -- singular-extension -- if one move's value is much better than rest
      -- search-until quiescent -- look for quiet

    • Heuristic Pruning Also Limits Search
      -- limit tree growth
      -- tapered search
      • rank a node's children by a (fast) evaluation
      • b(child) = b(parent) - rank(child)
      • where "b" is number of branches to keep

    • "Deep Thought" Plays Grandmaster Chess
      -- now "Deep Blue"
      -- see also
      -- uses alpha-beta search, with selective extensions
      -- could search to a depth of 12 ply
      -- has opening "book" and all five-or-fewer piece endgames
      -- massively parallel, 30-node, RS/6000, SP-based computer system enhanced with 480 special purpose VLSI chess chips
      -- evaluates 200,000,000 chess positions per second
      -- several months working with a grandmaster on evaluation function
      -- "In three minutes, ... it computes everything it knows about the current position from scratch."

    • How do people play Chess...?


7 Rules and Rule Chaining

  • Rule-Based Deduction Systems
    -- If-Then
    -- antecedent-consequent
    -- forward-chaining
    -- satisfied, triggered, fired
    -- working memory, rule base
    -- LHS: boolean function that tests, or pattern(s)
    -- RHS: actions, or pattern
    -- nonmonotonic vs monotonic
    • can a rule invalidate another? (order dependent)
    -- good: rules are small slices of knowledge
    -- bad: rules are small slices of knowledge

    • Post's Theorem
      -- production systems can compute all computable functions
      -- hence, if intelligence is computable, productions can produce it

    • Many Rule-Based Systems Are Deduction Systems
      -- deduction -- similar to logic
        (but don't assume truth)
      -- with assertions (assert a fact)
      -- deduction - all triggered rules can fire

    • A Toy Deduction System Identifies Animals
      -- Fig.7.2
      -- antecedent and consequent may include variables
      -- variables get bound to values
      -- (?x has-color black)
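      -- a stripped-down forward-chaining sketch (Python; ground facts only, no variables
         or bindings; the rules are a hypothetical fragment of the animal example):

           # each rule pairs a set of antecedents with one consequent
           rules = [
               ({'has hair'},                            'is a mammal'),
               ({'is a mammal', 'eats meat'},            'is a carnivore'),
               ({'is a carnivore', 'has black stripes'}, 'is a tiger'),
           ]

           def forward_chain(facts, rules):
               facts = set(facts)                    # working memory
               changed = True
               while changed:                        # fire until nothing new is asserted
                   changed = False
                   for antecedents, consequent in rules:
                       if antecedents <= facts and consequent not in facts:
                           facts.add(consequent)     # satisfied, triggered, fired
                           changed = True
               return facts

           print(forward_chain({'has hair', 'eats meat', 'has black stripes'}, rules))
           # working memory now also contains 'is a mammal', 'is a carnivore', 'is a tiger'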

    • Deduction Systems May Run Either Forward or Backward
      -- Fig.7.3
      -- backward chaining (goal directed reasoning)
      -- make hypothesis
      -- work back through rules to supportive known facts
      -- acts as problem decomposition

    • The Problem Determines Whether Chaining Should Be Forward or Backward
      -- hard to figure out in practice
      -- combine knowledge of fan out, fan in, number of facts and number of conclusions
      -- try to minimize effort
      -- try to match human approach if that's important

    • Control Issues
      -- infinite loops
      -- when to stop
      -- goal reached?
      -- rules used more than once?

  • Rule-Based Reaction Systems
    -- condition-action rules
    -- rules may assert, or do other actions
    -- forwards, data-directed

  • A Toy Reaction System Bags Groceries
    -- note "add-delete" syntax (IF a THEN DELETE b, ADD c)
      (relates to planning systems)
    -- use subtask name in WM (e.g., step is bag-small-items)

  • Reaction Systems Require Conflict Resolution Strategies
    -- rule ordering
    -- rule groups
      e.g., rule selection by subtask
    -- conflict resolution strategies
    • specificity (LHS)
    • predefined priority
    • data recency (use recent WM elements)
    • most informative (RHS)
    • etc. etc. etc.
    -- how to pick?
  • Procedures for Forward and Backward Chaining
    • Depth-First Search Can Supply Compatible Bindings for Forward Chaining
      -- must produce new assertions only
      -- alternative binding
      -- can search tree of alternative bindings in different ways
      -- need all possibilities, or just one?

    • The Rete Approach Deploys Relational Operations Incrementally
      -- Fig.7.11
      -- for forward chaining rules
      -- don't search, pre-index rules (trade space for time)
      -- build rete for every rule
      -- "pour" given assertions through whole rete and keep bindings
      -- can then easily determine which rules are satisfied
      -- key idea -- each rule firing changes WM very little
      -- take WM change and put that through rete to see effect
      -- all other bindings stay the same


    8 Rules, Substrates, and Cognitive Modeling

    • Rule-based Systems Viewed as Substrate
      • Explanation Modules Explain Reasoning
        -- Fig.8.1 & 8.2
        -- inference net & goal tree
        -- inference net -- shows flow of reasoning
        -- goal tree -- shows problem decomposition
        -- use to get explanation
        -- "how did you show...?" -- look down goal tree
        -- "why did you use...?" -- look up goal tree

      • Reasoning Systems Can Exhibit Variable Reasoning Styles
        -- can use rule "providing assumptions"
        -- perhaps good to assume if expensive to show
        -- "providing A" = "if we assume A"
        -- "unless A" = "if we assume not A"
        -- reasoning modes (i.e., how to deal with assumptions)
        -- check all assumptions vs. ignore all assumptions
        -- not checking all assumptions makes result less reliable

      • Probability Modules Help You to Determine Answer Reliability
        -- conclusions (assertions) are rarely certain
          e.g., ... THEN they have the 'flu
        -- rules are rarely certain
          e.g., IF I see water on my office window THEN PROBABLY it is raining
        -- If A and B THEN C -- both A and B may each have a probability
        -- so lhs (A and B) has a probability
        -- if lhs has probability and rule has probability, so does conclusion

      • Two Key Heuristics Enable Knowledge Engineers to Acquire Knowledge
        -- knowledge engineering
        -- 1. ask about specific situations
        -- 2. distinguish between apparently similar situations

      • Acquisition Modules Assist Knowledge Transfer
        -- Fig.8.5
        -- help knowledge engineers make new rules
        -- build tree of classes of rules (by conclusion type)
        -- form "typical" rule of each type
        -- new rules are compared to typical as a check
        -- e.g., missing antecedent?

      • Rule Interactions Can Be Troublesome
        -- new rules are rarely independent of existing ones

      • Rule-Based Systems Can Behave Like Idiot Savants
        -- don't reason at multiple levels
        -- don't use constraint-exposing models
        -- don't show task structure
        -- don't look at problems from different perspectives
        -- don't know when to break the rules
        -- don't have access to the reasoning behind the rule

    • Rule-Based Systems Viewed as Models for Human Problem Solving
      • Rule-Based Systems Can Model Some Human Problem Solving
        -- consider WM to be Short term Memory (STM)
        -- rules used to model actions on STM
        -- 7 ± 2 chunks
        -- can hypothesize rules for simple human tasks
        -- e.g., arithmetic

      • Protocol Analysis Produces Production-System Conjectures
        -- protocol collection -- talk while solving problem
        -- protocol analysis
        -- infer productions
        -- infer changes in state of knowledge
        -- form problem-behavior graph

      • SOAR Models Human Problem Solving, Maybe
        -- SOAR is an architecture
        -- an integrated collection of representations and methods
        -- Also a theory of cognition
        • Problem spaces as a single framework for all tasks and subtasks to be solved
        • Production rules as the single representation of permanent knowledge
        • Objects with attributes and values as the single representation of temporary knowledge
        • Automatic subgoaling as the single mechanism for generating goals
        • Chunking as the single learning mechanism
        -- it breaks rule-based systems into basic units of representation and action
          e.g., what to do and how to do it are different problems
        -- Impasse is a situation where SOAR doesn't know how to proceed
        -- Sub-goal created to resolve the impasse
        -- Resolved impasse triggers chunk formation
        -- Chunk is new rule that describes how impasse was dealt with
        -- Search control is encoded in production rules that create preferences for operators
        -- SOAR acts on goals, problem spaces, states, operators.
        -- SOAR does propose, compare, select, refine on each thing.

    9 Frames and Inheritance

    • Frames, Individuals, and Inheritance
      -- frames as knowledge representation
      -- frames as model of memory
      -- at level above semantic nets
      -- can represent objects, actions, relationships, ...

      • Frames Contain Slots and Slot Values
        -- Fig.9.1
        -- frames have slots
        -- slots have a value
        -- in other models of frames, slots may contain meta-knowledge, defaults, etc.

      • Frames may Describe Instances or Classes
        -- Fig.9.2
        -- instance frames represent individuals
        -- class frames represent classes
        -- "Is-a" is-a-member-of-the-class
        -- Grumpy Is-a Manager (I to C)
        -- "Ako" a-kind-of
        -- Manager Ako Competitor (C to C)
        -- allows property inheritance
        -- note: birds fly, a penguin is a bird, hence...?
        • i.e., may need to delete property (block inheritance)
        • add property (corgis have smooth coats)
        • modify property (3-legged tables)

      • Inheritance Enables When-Constructed Procedures to Move Default Slot Values from Classes to Instances
        -- instances inherit slots (and values)
        -- allows knowledge to be written in one place
        -- hence easier knowledge maintenance
        -- values in classes represent default knowledge for instances
        -- e.g., all dogs have color brown
        -- classes can have when-constructed procedures
        -- also known as "attached actions"
        -- they can be inherited
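        -- a tiny sketch of slot-value inheritance (Python; the frames, slots, and
           values below are hypothetical, and when-constructed demons are omitted):

             class Frame:
                 # a frame: named slots with values, plus a parent to inherit from
                 def __init__(self, name, parent=None, **slots):
                     self.name, self.parent, self.slots = name, parent, slots

                 def get(self, slot):
                     # property inheritance: look locally, then climb to the parent frame
                     if slot in self.slots:
                         return self.slots[slot]
                     return self.parent.get(slot) if self.parent else None

             competitor = Frame('Competitor', fears='Nothing')
             manager    = Frame('Manager', parent=competitor, pay='High')  # Manager Ako Competitor
             grumpy     = Frame('Grumpy', parent=manager)                  # Grumpy Is-a Manager
             print(grumpy.get('pay'), grumpy.get('fears'))                 # High Nothing (defaults)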

      • A Class Should Appear Before All Its Superclasses
        -- Fig.9.4
        -- if frame has >1 parent frame, not a strict hierarchy
        -- multiple inheritance (often prohibited)
        -- need procedure to decide class-precedence list
        -- many possible ways
        -- can lead to contradictions
          -- professor is an uncle therefore kind, and, ...
          -- professor is a teacher therefore not kind

    • Demon Procedures
      • When-Requested Procedures Override Slot Values
        -- can provide a value even if one isn't present
        -- can override existing value
        -- can do simple inference

      • When-Written Procedures Can Maintain Constraints
        -- captures constraints between values
        -- if new value of slot A is 10 then value of slot B must be 20

      • With-Respect-to Procedures Deal with Perspectives and Contexts
        -- size of dog from perspective of ant is large
        -- size of dog from perspective of elephant is small
        -- mood of student in context of class is grumpy
        -- mood of student in context of pizza is happy

      • Inheritance and Demons Introduce Procedural Semantics
        -- procedures add meaning to frames
        -- procedures are part of the representation

    • Frames, Events, and Inheritance
      • Digesting News Seems to Involve Frame Retrieving and Slot Filling
        -- stereotypical events have expected slots
        -- i.e., produce expectations
        -- things play fixed roles
        -- expectations help understanding
        -- how to select the right event frame?

    • Frames as a theory of memory
      -- stereotypical objects (typical & possible values)
      -- hierarchy & inheritance
        (common properties stored only once)
      -- expectations (e.g., entering a kitchen I see...)
      -- defaults
      -- inference triggered by different situations


    10 Frames and Commonsense

    • Thematic-role Frames
      -- language describes actions and change
      -- frames can be used to represent meaning

      • An Object's Thematic Role Specifies the Object's Relation to an Action
        -- Fig.10.1
        -- Verbs = actions
        -- Noun phrases = thematic roles
        -- e.g., agent, thematic object, instrument
        -- constraints introduced by verbs
        -- not all verbs allow all roles
        -- thematic roles indicated by words, e.g., with, to, by, near, before
        -- some thematic roles:
        • "Agent" responsible for action
        • "Beneficiary" is who action is for
        • "Thematic object" is what sentence is about
        • "Instrument" is used as tool in action
        • "Source/Destination" refer to physical position changes
        • "Time" is when action done
        • "Location" is where action done

      • Filled Thematic Roles Help You to Answer Questions
        -- questions tend to be about one thematic role
        -- e.g., "With what...?" (instrument)

      • Various Constraints Establish Thematic Roles
        -- NPs have thematic role ambiguity
        -- verbs constrain what roles, and where in sentence NPs go
        -- prepositions constrain NPs roles (e.g., from --> source)
        -- nouns constrain role possibilities (e.g., inanimate)

      • A Variety of Constraints Help Establish Verb Meanings
        -- verbs & VPs have meaning ambiguity (e.g., shot, saw)
        -- NP may disambiguate (e.g., shot the rabbit)
        -- Particles may disambiguate (e.g., threw away vs. threw up)

      • Constraints Enable Sentence Analysis
        -- have dictionary of info about nouns and verbs
        -- find verb
        -- find thematic object
        -- handle other noun phrases
        -- use constraints throughout

      • Examples Using Take Illustrate How Constraints Interact
        -- example using "take"
        -- transport, swindle, swallow, steal, date, remove, control

    • Expansion into Primitive Actions
      -- underlying meanings for verbs
      -- what assumptions can be made for an action
      -- e.g., how done, where done, with what, ...
      -- explain relationship between "buy" and "sell"

      • Primitive Actions Describe Many Higher-Level Actions
        -- small number of primitives to describe actions
        -- Basic English (1000 words)
        -- sample primitives:
          move-body-part     move-object
          expel     ingest
          propel     speak
          see     hear
          smell     feel
          move-possession     move-concept
          think-about     conclude
        -- how do you know you have the right set?

      • Actions Often Imply Implicit State Changes and Cause-Effect Relations
        -- Fig.10.6
        -- action Result state-change
        -- Fig.10.7
        -- action Result action

      • Actions Often Imply Subactions
        -- Fig.10.10
        -- person moving block implies body parts moving too
        -- person eating implies body part moving to move instrument

      • Primitive-Action Frames and State-Change Frames Facilitate Question Answering and Paraphrase Recognition
        -- patterns of primitives can be matched to db
        -- db of "scripts" (stereotypical actions in known situations)
        -- do two sentences have same meaning? (e.g., buy & sell)

      • CYC Captures Commonsense Knowledge
        -- what is Cyc?
        -- For fun: video presentations about Cyc
        -- For fun: HAL & Cyc


    11 Numeric Constraints and Propagation

    • Propagation of Numbers Through Numeric Constraint Nets
      • Numeric Constraint Boxes Propagate Numbers through Equations
        -- Fig.11.1
        -- equation is a constraint
        -- A=B+C given A=3, B=2, C=2 ?
        -- set of equations = set of constraints
        -- values can propagate
        -- A given A=B+C and B=2, C=2 ?
        -- Direction? A=B+C, B=A-C, C=A-B
        -- constraints connect via common variables
        -- arithmetic constraint net

    • Propagation of Probability Bounds Through Opinion Nets
      -- Fig.11.4
      -- most opinions aren't certain

      • Probability Bounds Express Uncertainty
        -- nodes are AND and OR
        -- a value is a range of probabilities (e.g., V = [0.25, 0.75] )
        -- l(V) is the lower bound (0.25)
        -- u(V) is the upper bound (0.75)
        -- OR(A, B) gives [l(A or B), u(A or B)]
          -- l(A or B) >= max[l(A), l(B)]
          -- u(A or B) <= u(A) + u(B)
        -- AND(A, B) gives [l(A and B), u(A and B)]
          -- l(A and B) >= l(A) + l(B) -1
          -- u(A and B) <= min[u(A), u(B)]
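        -- the bound formulas above, as small helpers (Python; clamped to [0, 1];
           an illustrative sketch, not course code):

             def or_bounds(a, b):
                 (la, ua), (lb, ub) = a, b
                 return (max(la, lb), min(1.0, ua + ub))        # bounds on P(A or B)

             def and_bounds(a, b):
                 (la, ua), (lb, ub) = a, b
                 return (max(0.0, la + lb - 1.0), min(ua, ub))  # bounds on P(A and B)

             A, B = (0.25, 0.75), (0.5, 1.0)
             print(or_bounds(A, B), and_bounds(A, B))   # (0.5, 1.0) (0.0, 0.75)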

    • Propagation of Surface Altitudes Through Arrays
      -- arrays with values, may be sparse, or contain errors
      -- constraints can express local relationships in array
      -- can do smoothing of images
      -- can do filling in of sparse data
      -- e.g., fill holes with average of surroundings

      • Local Constraints Arbitrate between Smoothness Expectations and Actual Data
        -- Fig.11.9
        -- sparse data with value and with confidence
        -- relaxation formula
        -- replace current value with new value based on a combination of current value and average of neighbors: both weighted by confidence
        -- 30 data points, hence effectively 30 copies of formula
        -- actually sweep formula across data propagating new values through array
        -- repeat until stable
        -- Fig.11.10
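        -- one possible reading of the relaxation sweep, in 1-D (Python; the profile and
           confidences are hypothetical, and the exact weighting is an assumption):

             def relax(values, confidence, sweeps=100):
                 # each point moves toward the average of its neighbors, anchored to its
                 # own measurement in proportion to the confidence in that measurement
                 v = list(values)
                 for _ in range(sweeps):                        # repeat until ~stable
                     new = v[:]
                     for i in range(1, len(v) - 1):
                         neighbor_avg = (v[i - 1] + v[i + 1]) / 2.0
                         c = confidence[i]
                         new[i] = c * values[i] + (1 - c) * neighbor_avg
                     v = new
                 return v

             # sparse altitude profile: confident endpoints, unknown middle
             heights    = [0.0, 0.0, 0.0, 0.0, 10.0]
             confidence = [1.0, 0.0, 0.0, 0.0, 1.0]
             print([round(h, 1) for h in relax(heights, confidence)])   # [0.0, 2.5, 5.0, 7.5, 10.0]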

      • Constraint Propagation Achieves Global Consistency through Local Computation
        -- most confident values affect neighbors
        -- and their neighbors, etc.
        -- 30 local constraints lead to global consistency


    12 Symbolic Constraints and Propagation

    • Propagation of Line Labels through Drawing Junctions
      -- numbers, probability intervals
      -- next... symbolic labels

      • There Are Only Four Ways to Label a Line in the Three-Faced-Vertex World
        -- Fig.12.1
        -- given a line drawing of a scene
        -- find an interpretation
        -- Fig.12.2
        -- i.e., label all lines as boundary (<)(>), convex (+), concave (-)
        -- junctions of lines = vertices of object
        -- Fig.12.4
        -- junction label (e.g., fork +++)
        -- natural constraints limit possible junction labels
        -- 208 conceivable junction labellings, but only 18 physically possible
        -- start with simplifying assumptions (e.g., no shadows)

      • There Are Only 18 Ways to Label a Three-Faced Junction
        -- Fig.12.11
        -- 208 conceivable junction labellings, but only 18 physically possible
        -- Fig.12.14
        -- e.g., only 5 forks (incl. +++, ---)

      • Finding Correct Labels Is Part of Line-Drawing Analysis
        -- Fig.12.16
        -- interior junction labellings usually ambiguous (e.g., +++ vs ---)
        -- principle: exploit constraints and regularities!
        -- constraints come from:
        • regularities in the world (due to physics or people)
        • surfaces being planar
        • uniform color
        • uniform texture
        • lighting
        • continuity of edges
        • etc.

      • Waltz's Procedure Propagates Label Constraints through Junctions
        -- Fig.12.19
        -- line label constrains junctions at each end,
        -- and junctions at each end constrain line label.
        -- put set of appropriate junction labels (e.g., fork) at junction
        -- move to next junction, do same
        -- remove incompatible labels
        -- propagate changes
        -- do for all junctions, until stable
        -- Fig.12.20

      • Many Line and Junction Labels Are Needed to Handle Shadows and Cracks
        -- Fig.12.24
        -- both add more labels and more constraints

      • Illumination Increases Label Count and Tightens Constraint
        -- adds more labels and more constraints
        -- now 11 instead of just 4 line labels
        -- L vertex: 2.5 * 10^3 possible junctions, 80 actual

      • The Flow of Labels Can Be Dramatic
        -- visit border junctions first (less ambiguous)
        -- may take only a few visits to an internal vertex

      • The Computation Required Is Proportional to Drawing Size
        -- work is roughly linear wrt number of lines
        -- Note: obscuring objects restrict flow

    • Propagation of Time-Interval Relations
      • There Are 13 Ways to Label a Link between Interval Nodes Yielding 169 Constraints
        -- Fig.12.28
        -- 13 possible relations between 2 time intervals
        -- Fig.12.29
        -- e.g., A before B & B before C ----> A before C

    • Constraint Satisfaction Problems in general
      -- CSP: set of vbls, set of constraints
      -- each vbl has a set of possible values
      -- find an assignment of values that satisfies constraints
      -- e.g., map coloring, 8 queens

      • CSP solutions by Searching
        -- solution possible via backtracking depth first search
        -- often a huge search space
        -- which variable next? pick most constrained vbl (fewest values)
        -- forward checking: look ahead one variable to see/record impact

      • CSP solutions by Constraint Propagation
        -- Arc Consistency: arc from vbl to vbl, represents binary constraint
        -- e.g., X <---different-color---> Y, with X={red, blue}, Y={red}
        -- look in both directions
        -- there exists a consistent assignment in X for all Y (blue)
        -- Y-to-X is consistent
        -- NOT there exists a consistent assignment in Y for all X
        -- X-to-Y is not consistent
        -- adjust X to produce consistency (delete red)
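        -- the revise step of arc consistency for the example above (Python; a hedged
           sketch, not a full AC algorithm):

             def revise(domains, x, y, constraint):
                 # make X arc-consistent with Y: delete X values no Y value supports
                 removed = False
                 for vx in list(domains[x]):
                     if not any(constraint(vx, vy) for vy in domains[y]):
                         domains[x].remove(vx)
                         removed = True
                 return removed

             domains = {'X': {'red', 'blue'}, 'Y': {'red'}}
             revise(domains, 'X', 'Y', lambda a, b: a != b)   # different-color constraint
             print(domains['X'])                              # {'blue'} -- red was deleted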

      • CSP solutions by Min-Conflicts heuristic
        -- e.g., for N-queens
        -- assign "reasonable" values to all variables
          (Note PSM with full initial assignment vs. incremental assignment)
        -- repeat
          -- randomly choose vbl in conflict
          -- choose new value for that vbl that minimizes conflicts
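        -- a min-conflicts sketch for N-queens (Python; an illustration under the usual
           one-queen-per-row encoding, not course code):

             import random

             def min_conflicts_queens(n, max_steps=10000):
                 cols = [random.randrange(n) for _ in range(n)]   # queen in row r is at cols[r]

                 def conflicts(r, c):
                     return sum(1 for r2 in range(n) if r2 != r and
                                (cols[r2] == c or abs(cols[r2] - c) == abs(r2 - r)))

                 for _ in range(max_steps):
                     conflicted = [r for r in range(n) if conflicts(r, cols[r]) > 0]
                     if not conflicted:
                         return cols                              # no conflicts: solved
                     r = random.choice(conflicted)                # randomly choose vbl in conflict
                     cols[r] = min(range(n), key=lambda c: conflicts(r, c))
                 return None

             print(min_conflicts_queens(8))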


    13 Logic and Resolution Proof

    • Rules of Inference
      -- some casual definitions...
      -- Inference: deriving new stuff from what's known
      -- Deduction: producing new facts from old facts using inference rules
      -- Induction: produce general description from specific examples
      -- Abduction: producing a likely fact from an old fact and an inference rule
      -- which are truth preserving?
      -- Logic: formal language: syntax & semantics
      -- a knowledge representation associated with deduction
      -- Propositional logic (no variables)
      -- 1st order predicate calculus
      -- true, false (2 valued)

      • Logic Has a Traditional Notation
        -- predicate: function that gives true or gives false
        -- predicate(symbol) {e.g., Green(x) }
        -- symbol denotes something satisfying predicate
        -- Fig.13.1
        -- conjunction (&), disjunction (v), negation (~), implication (==>)
        -- truth table (for A==>B) (TFF)
        -- substitutions (e.g., A==>B <==> ~AvB )
        -- de Morgan's Laws
          ~(A&B) <==> (~A)v(~B)
          ~(AvB) <==> (~A)&(~B)

      • Quantifiers Determine When Expressions Are True
        -- universal quantifier: for all
        -- existential quantifier: there exists (one or more)
        -- 1st order predicate calculus (variables represent objects)
        -- 2nd order? (variables can represent predicates)

      • Logic Has a Rich Vocabulary
        -- Fig.13.2
        -- literals: P(x), ~P(x)
        -- wffs: literals, or literals combined with v, &, ~, ==>
        -- wffs: with quantifiers
        -- e.g., Ax[Person(x) ==> Mortal(x)]
        -- e.g., Person(Susan) ==> Mortal(Susan)
        -- clause: literal v literal

      • Interpretations Tie Logic Symbols to Worlds
        -- Fig.13.3
        -- symbols <------------> objects (in imaginable world)
        -- predicates <------------> relations (in imaginable world)
        -- provides an interpretation

      • Proofs Tie Axioms to Consequences
        -- proof (derive true expressions: theorems)
        -- given axioms (stated as true)
        -- use sound rules of inference
        -- special
          satisfiable: expression is T for some possible interpretation of symbols
          valid: expression is T for all possible interpretations of symbols
        -- Modus Ponens: given axioms A, A==>B then B logically follows

      • Resolution Is a Sound Rule of Inference
        -- i.e., it is truth preserving
        -- given axioms AvB, ~BvC
        -- resolvent is AvC

    • Resolution Proofs
      • Resolution Proves Theorems by Refutation
        -- Fig.13.4
        -- assume negation of theorem to be shown is true
        -- show that proof attempt leads to conflict
        -- conclude that theorem must be true

      • Using Resolution Requires Axioms to Be in Clause Form
        -- e.g., ~Brick(x) v ~On(u,y) v ~On(y,u)
        -- key steps to get clause form
        • eliminate implications
        • move negations: e.g., ~Ax[P(x)] --> Ex[~P(x)]
        • eliminate existential quantifiers (Skolem functions)
        • rename variables if necessary
        • move universal quantifiers to the left
        • move disjunctions down to literals
        • eliminate conjunctions
        • rename variables
        • eliminate universal quantifiers

      • Proof Is Exponential
        -- can't guide it with MEA or A*
        -- can use control strategies that may help
        -- e.g., unit preference, set-of-support
        -- search may be exponential (i.e., long proofs are bad!)
        -- search may not terminate if there isn't a proof
        -- semidecidable: tell you only if it's a theorem

      • Resolution Requires Unification
        -- resolution requires literals to match
        -- e.g., ~On(x, Table) with On(Block, y)
        -- substitution that makes the clauses resolve is "unifier"
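        -- a small unifier over tuple-encoded literals (Python; the '?'-variable encoding
           is an assumption, and the occurs check is omitted):

             def unify(x, y, s=None):
                 # return a substitution making x and y identical, or None
                 s = {} if s is None else s
                 if x == y:
                     return s
                 if isinstance(x, str) and x.startswith('?'):
                     return unify_var(x, y, s)
                 if isinstance(y, str) and y.startswith('?'):
                     return unify_var(y, x, s)
                 if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
                     for xi, yi in zip(x, y):
                         s = unify(xi, yi, s)
                         if s is None:
                             return None
                     return s
                 return None

             def unify_var(var, value, s):
                 if var in s:
                     return unify(s[var], value, s)
                 return {**s, var: value}

             # the literals in On(x, Table) and On(Block, y) unify with this substitution:
             print(unify(('On', '?x', 'Table'), ('On', 'Block', '?y')))
             # {'?x': 'Block', '?y': 'Table'}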

      • Traditional Logic Is Monotonic
        -- theorems added, never removed
        -- number of clauses only increases
        -- compare with planning

      • Theorem Proving Is Suitable for Certain Problems, but Not for All Problems
        -- problem with long proofs
        -- some knowledge very hard to formalize
        -- some knowledge requires special logic


    14 Backtracking and Truth Maintenance

    • Chronological and Dependency-Directed Backtracking
      -- remember the depth-first search
      -- if blocked, back up to the last choice point and make another choice
      -- Chronological Backtracking (wrt time/order choice made)

      • Limit Boxes Identify Inconsistencies
        -- "limit boxes" are constraints on values (e.g., <2 )
        -- choices propagate through equations
        -- violating limit causes conflict
        -- conflict requires backtracking
        -- examples of "conflicts"
        • blocked paths in path/route finding
        • failing constraints in CSP solving search
        • new information that says that deduced/propagated value is wrong
        • assumption shown incorrect

      • Chronological Backtracking Wastes Time
        -- chronological backtracking undoes decisions that may be OK
        -- chronological backtracking doesn't respond to the cause of the conflict
        -- a major problem if a bad decision was made early on

      • Nonchronological Backtracking Exploits Dependencies
        -- need Nonchronological Backtracking
        -- use dependencies (choices that contributed to conflict)
        -- need clean way to keep dependency info (as "justifications")

    • Proof by Constraint Propagation
      • Truth Can Be Propagated
        -- propagate truth values through constraints on truth
        -- constraints are rules of logical operators (e.g., A==>B, TFF)
        -- literals or expressions: true, false, unknown

      • Truth Propagation Can Establish Justifications
        -- propagate truth through net, keep records
        -- justification links from new truth value to dependencies

      • Justification Links Enable Programs to Change Their Minds
        -- can make Assumptions about truth values (e.g., assume True)
        -- more useful to have a value instead of none
        -- justification for value is "assumption"
        -- assumptions may lead to conflicts
        -- assumptions may be undone by known truth values
        • could record propositions that, if found, will indicate assumption is false.
        • e.g., assume "p", record "~p" as a problem
        -- backtracking guided by justification links

      • Proof by Truth Propagation Has Limits
        -- works with propositional logic only (no variables)


    15 Planning

    • Now starting on "Applications"

    • Planning Using If-Add-Delete Operators
      -- plan: sequence of actions intended to achieve goal(s)
      -- what if we were to use logic? (monotonic)
      -- initial situation, goal situation, operators
      -- search for sequence of operators
      -- transform initial situation into goal situation
      -- Fig.15.1
      -- consider sample problem
        initial: On(A,C)&On(D,B)
        goal: On(A,B)&On(B,C)
      -- what are possible paths?
      -- why plan?
      • can anticipate problems
      • can search in model and not in world
      • can aid in error recovery if plan doesn't work

      • Operators Specify Add Lists and Delete Lists
        -- operators
        • have preconditions (prerequisites)
        • use variables for generality (instantiate when op used)
        • have add list
        • have delete list (i.e., not monotonic)
        -- must describe everything relevant for ops to work
          e.g., Clear(A)
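        -- applying one instantiated if-add-delete operator to a state of assertions
           (Python; the assertion strings and the operator are hypothetical):

             def apply_op(state, preconditions, add_list, delete_list):
                 # return the new state if all preconditions hold, else None
                 if not preconditions <= state:
                     return None                       # prerequisites not satisfied
                 return (state - delete_list) | add_list

             state = {'On(A,C)', 'On(D,B)', 'Clear(A)', 'Clear(D)'}
             new_state = apply_op(state,
                                  preconditions={'On(A,C)', 'Clear(A)'},   # move A from C to table
                                  add_list={'On(A,Table)', 'Clear(C)'},
                                  delete_list={'On(A,C)'})
             print(new_state)   # On(A,C) deleted; On(A,Table) and Clear(C) added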

      • You Can Plan by Searching for a Satisfactory Sequence of Operators
        -- search can be exponential due to # of ops and possible bindings
        -- i.e., don't do linear search

      • Backward Chaining Can Reduce Effort
        -- Fig.15.3
        -- one goal, therefore try backward chaining from goal
        -- i.e., look for operator that adds all or part of goal
        -- use op preconditions as new goal (may branch)
        -- set up Establishes links
        -- get complete plan, with partial order (POP)
        -- topological sort gets linear plan
        -- trying for linear plan from scratch is flawed
          (too much detail too soon & overcommits)

      • Impossible Plans Can Be Detected
        -- try more than one goal (may be order sensitive)
        -- Fig.15.4
        -- delete of operator may interfere with precond of another (Threat)
        -- Fig.15.5
        -- check if ordering ops can remove threat (no Before loops!)
        -- Fig.15.6
        -- Fig.15.7
        -- Fig.15.8

      • Partial Instantiation Can Help Reduce Effort Too
        -- instantiate as much as necessary
        -- e.g., put block down on x
        -- in general can do for both objects and actions
        -- principle of "least commitment" (benefits?)
        -- hierarchical planning

    • Planning Using Situation Variables
      -- try to use logic (monotonic, remember?)
      -- operators take one situation to another (sequence)

      • Finding Operator Sequences Requires Situation Variables
        -- add Situation Variables
        -- On(A,B,s1) but ~On(A,B,s2)
        -- set up goal as situation
        -- use operations that represent actions
        -- operations take arguments and a situation, and ...
        --    produce new situation that makes a predicate true
        -- e.g., if x isn't on the table in s then in the new situation after using STORE it is.
        -- express all in predicate calc
        -- put all in clause form
        -- use resolution to refute ~goal
        -- Fig.15.11
        -- use Answer term

      • Frame Axioms Address the Frame Problem
        -- No! not that kind of frame! (scope)
        -- if On(A,Table,s)&On(B,Table,s), move A, where's B in new situation?
        -- need Frame Axioms ("how predicates survive operations")


    16 Learning by Analyzing Differences

      -- some say no learning ==> no intelligence
      -- what is learning?
      -- types?
      -- supervised, unsupervised, reinforcement
      -- Rote Learning, Learning from Advice, Learning from Examples, Explanation based, by Discovery, etc.
      -- also methods within types (e.g., Analyzing Differences)
      -- level change between what is given and what is learned?
      • e.g., generate general description from detailed examples (inductive)
      • e.g., could store information as it is presented (cases)
      -- sensitivity to noise (what's noise?)

    • Induction Heuristics
      -- supervised
      -- learning by induction from "well-chosen" given examples
      -- inductive reasoning (from specific to general)
      -- learn a concept (class description)
      -- Fig.16.1
      -- given +ve and -ve examples
      -- do examples matter?
        (e.g., how negative?)
      -- does order matter?

      • Responding to Near Misses Improves Models
        -- -ve examples are "near-miss" (not much wrong)
        -- model repr and example repr the same
        -- the right repr enables learning (e.g., explicit spatial relationships)
        -- need to isolate what's important in examples

      • Responding to Examples Improves Models
        -- Fig.16.2
        -- can require relations (e.g., must-support)
        -- Fig.16.3
        -- can forbid relations (e.g., must-not-touch)
        -- can enlarge set of types (e.g., new A-or-B class)
        -- Fig.16.4
        -- can generalize types (e.g., "brick" to "block")

      • Near-Miss Heuristics Specialize; Example Heuristics Generalize
        -- -ve examples restrict model (it was too general)
        -- +ve examples relax model (it was too specific)
        -- heuristics:
          require-link, forbid-link, climb-tree, enlarge-set, drop-link, close-interval

      • Learning Procedures Should Avoid Guesses
        -- wait-and see principle (if in doubt, do nothing) (commitment?)
        -- no-altering principle (create a special case)

      • Learning Usually Must Be Done in Small Steps
        -- it's easier to learn something you almost know
        -- you need the right concepts to be able to learn (e.g., need brick to learn arch)

    • Identification
      • Must Links and Must-Not Links Dominate Matching
        -- describe & match (e.g., is unknown object an arch?)
        -- similarity dominated by must and must-not links

      • Models May Be Arranged in Lists or in Nets
        -- Fig.16.5
        -- similarity net
        -- similar models linked by their differences
        -- similar to "graph of models" used to select model for analysis


    19 Learning by Recording Cases

    • Recording and Retrieving Raw Experience
      -- sometimes good models are impossible to build
      -- need to resort to storing examples (cases)
      -- need to index cases
      -- may need to adapt them to new situation
      -- when is use of cases good? bad?

      • The Consistency Heuristic Enables Remembered Cases to Supply Properties
        -- assume consistency
        -- i.e., unknown property same as known
        -- Fig.19.1
        -- Fig.19.2
        -- e.g., feature space of blocks (given H and W, color?)
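        -- a direct (linear-scan) reading of the consistency heuristic (Python; the block
           cases and feature names are hypothetical):

             def nearest_case_property(cases, query, known, unknown):
                 # give the query the unknown property of the nearest recorded case
                 def dist(case):
                     return sum((case[f] - query[f]) ** 2 for f in known) ** 0.5
                 return min(cases, key=dist)[unknown]

             cases = [{'w': 1, 'h': 4, 'color': 'red'},
                      {'w': 4, 'h': 1, 'color': 'blue'},
                      {'w': 3, 'h': 3, 'color': 'green'}]
             print(nearest_case_property(cases, {'w': 3.5, 'h': 2.5}, ['w', 'h'], 'color'))  # green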

    • Finding Nearest Neighbors
      • A Fast Serial Procedure Finds the Nearest Neighbor in Logarithmic Time
        -- Fig.19.7
        -- use decision tree
        -- each node has a test (e.g., Width > 3 ?)
        -- branches depending on answers
        -- leaf node has specific result
        -- in blocks example use a 2D k-d tree
        -- Fig.19.5
        -- Fig.19.6

    • CBR: Case-Based Reasoning
      -- have stored cases (e.g., recipes)
      -- indexed (perhaps via k-d tree)
      -- given ingredients find recipe (e.g., chicken, asparagus)
      -- find chicken & dumplings, and chicken & broccoli
      -- test for success, select chicken & broccoli
      -- not quite right, therefore "adaptation" needed
        could adaptation be done by CBR?
      -- substitution, gives needed recipe
      -- retain the final solution as a new case
      -- what if adaptation by substitution isn't enough?
      -- cases for everything?
      • case-based planning?
      • case-based diagnosis?
      • case-based design?
      • case-based chess?
      • case-based soccer?


    20 Learning by Managing Multiple Models

    • The Version-Space Method
      -- needs noise free data
      -- needs sequence of +ve and -ve examples
      -- -ve examples don't need to be near misses!
      -- builds a description (model) that describes data
      -- uses records from database
      -- does a kind of "data-mining"
      -- record has values for situation-characterizing attributes
      -- e.g., (place, meal, day, cost) plus "+ve" or "-ve"
      -- Sample data
      -- example: (Sam's, Dinner, Thursday, Expensive)
      -- +ve if person gets allergic reaction
      -- needs known set of attributes

      • Version Space Consists of Overly General and Overly Specific Models
        -- Fig.20.1
        -- version space: between most general and most specific description
        -- Negative examples specialize general descriptions (restrict)
          (It can't include the -ve example)
        -- Negative examples prune the specific descriptions
          (It can't both be and not be a suitable description)
        -- Positive examples generalize specific descriptions (relax)
          (It must be expanded to include the +ve example)
        -- Positive examples prune the general descriptions
          (Remove general models that don't match the +ve example)

      • Generalization and Specialization Leads to Version-Space Convergence
        -- Fig.20.2
        -- Each specialization must be a generalization of some specific model
          i.e., aim for the tree that's growing upwards
        -- No specialization can be a specialization of another general model
        -- Figs. 20.3 to 20.7

    • Version-Space Characteristics
      -- Result: even if you don't get a single model you get something useful
      -- Could this method be used for learning an Arch model?


    21 Learning by Building Identification Trees

    • From Data to Identification Trees
      -- widely used technique for learning
      -- trees can be used to generate rules
      -- Sample data
      -- uses records with values for fixed set of attributes
      -- e.g., (Name, Hair, Height, Weight, Lotion)
      -- plus classification for that record (e.g., sunburned)
      -- e.g. (Sarah, blonde, average, light, no, sunburned)
      -- samples may have noise (meaning?)
        (why is it OK for this approach?)

      • The World Is Supposed to Be Simple
        -- does Name affect sunburn?
        -- learning process prunes attributes that don't affect classification (useful)
        -- build identification tree (type of decision tree)
        -- Fig.21.1 and 21.2
        -- Occam's razor (for identification trees): small is good

      • Tests Should Minimize Disorder
        -- pick which attribute for root node and work down
        -- pick attributes based on "sorting power"
        -- i.e., minimizes disorder
        -- Fig.21.3 and 21.4
        -- i.e., divides data into most homogeneous subsets

      • Information Theory Supplies a Disorder Formula
        -- Fig.21.5
        -- information "entropy" in information theory
          (how much randomness there is in a signal or random event)
        -- Disorder measured down each branch under a node
        -- Average disorder is weighted sum across all those branches
        -- Find average disorder for each unused attribute (at that tree level)
        -- Pick one with least average disorder (i.e., strongest sorting power)
        -- repeat down the tree
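        -- The disorder formula is the usual weighted entropy: average disorder of a test = sum over branches b of (n_b / n_t) × ( - sum over classes c of (n_bc / n_b) log2 (n_bc / n_b) ). A minimal Python sketch on made-up sunburn-style records (not the book's full data set):

            from math import log2
            from collections import Counter

            def disorder(labels):
                """Entropy of the class labels reaching one branch."""
                n = len(labels)
                return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

            def average_disorder(records, attribute):
                """Weighted sum of branch disorders for splitting on one attribute."""
                total = len(records)
                branches = {}
                for r in records:
                    branches.setdefault(r[attribute], []).append(r["result"])
                return sum(len(b) / total * disorder(b) for b in branches.values())

            records = [
                {"hair": "blonde", "lotion": "no",  "result": "sunburned"},
                {"hair": "blonde", "lotion": "yes", "result": "none"},
                {"hair": "brown",  "lotion": "no",  "result": "none"},
                {"hair": "red",    "lotion": "no",  "result": "sunburned"},
            ]
            for attr in ("hair", "lotion"):
                print(attr, round(average_disorder(records, attr), 3))
            # -> hair 0.5, lotion 0.689: test hair first on this toy data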

      • Try the Aussie example

    • From Trees to Rules
      -- trace each path to get a rule
      -- leaf node class is the consequent

      • Unnecessary Rule Antecedents Should Be Eliminated
        -- if blonde and uses-lotion then no-suntan
        -- try to remove antecedent, does it make a difference?
        -- if not, cut it.
        -- e.g. everyone who uses lotion avoids sunburn, so blondness is irrelevant
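        -- A minimal check of that idea (made-up records; the rule and attribute names are illustrative):

            records = [
                {"hair": "blonde", "lotion": "no",  "result": "sunburned"},
                {"hair": "blonde", "lotion": "yes", "result": "none"},
                {"hair": "brown",  "lotion": "no",  "result": "none"},
                {"hair": "red",    "lotion": "yes", "result": "none"},
            ]

            def covered(record, antecedents):
                return all(record[a] == v for a, v in antecedents)

            def misclassifies(antecedents, consequent):
                """True if some record matches the antecedents but has a different class."""
                return any(covered(r, antecedents) and r["result"] != consequent for r in records)

            antecedents, consequent = [("hair", "blonde"), ("lotion", "yes")], "none"
            for a in antecedents:
                trimmed = [x for x in antecedents if x != a]
                if not misclassifies(trimmed, consequent):
                    print("can drop", a)   # -> can drop ('hair', 'blonde'): lotion alone predicts no sunburn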

      • Unnecessary Rules Should Be Eliminated
        -- once rules simplified, try to reduce # of rules
        -- use default rule
          "IF no other rule applies THEN answer"
        -- use default rule that produces simplest rules

    25 Learning by Simulating Evolution

    • Survival of the Fittest
      • Chromosomes Determine Hereditary Traits
        -- genes determine traits
        -- a chromosome has a list of genes
        -- scrambling genes between chromosomes is called crossover
        -- an altered gene is called a mutation

      • The Fittest Survive
        -- evolution through natural selection
        -- traits determine fitness
        -- fitness determines survival
        -- fitness determines breeding
        -- breeding determines survival of traits
        -- traits passed to offspring

    • Genetic Algorithms (GAs)
      • Genetic Algorithms Involve Myriad Analogs
        -- GAs use analogies with individuals, populations, chromosomes, genes, mutation, crossover, fitness, natural selection.
        -- Individuals have fitness, represented by the Quality Score of their chromosome.
        -- Populations of individuals are represented by sets of chromosomes
        -- Fig.25.1
        -- searching in multi-dimensional space of solutions
          (Quality surface)
        -- Fig.25.3
        -- mutation makes random change to gene(s)
        -- Fig.25.4
        -- crossover splits and recombines two chromosomes

      • The Standard Method Equates Fitness with Relative Quality
        -- fitness is the probability that chromosome survives to the next generation
        -- "standard method" is individual fitness relative to sum of fitnesses

      • To Mimic Natural Selection
        • create initial population, determine fitness, then loop until done:
        • mutate genes to produce new chromosomes.
        • produce crossovers to produce new chromosomes.
        • add all new chromosomes to current population.
        • select best of current generation to make new generation.
        • do biased random selection by fitness to complete new generation.
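        -- A minimal sketch of that loop with bit-string chromosomes and a made-up quality function (every parameter choice here is arbitrary; see the next bullet):

            import random

            def quality(chrom):
                """Assumed quality surface: count of 1 bits (+1 keeps every weight positive)."""
                return sum(chrom) + 1

            def mutate(chrom, rate=0.1):
                return [bit ^ 1 if random.random() < rate else bit for bit in chrom]

            def crossover(a, b):
                cut = random.randrange(1, len(a))      # split the two chromosomes and recombine
                return a[:cut] + b[cut:]

            def next_generation(population, size):
                new = [mutate(c) for c in population]
                new += [crossover(*random.sample(population, 2)) for _ in population]
                candidates = population + new
                candidates.sort(key=quality, reverse=True)
                keep = candidates[: size // 2]                      # best of the current generation
                weights = [quality(c) for c in candidates]
                while len(keep) < size:                             # biased random selection by fitness
                    keep.append(random.choices(candidates, weights=weights)[0])
                return keep

            population = [[random.randint(0, 1) for _ in range(8)] for _ in range(6)]
            for _ in range(20):
                population = next_generation(population, 6)
            print(max(population, key=quality))                     # usually all ones after a few generations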

      • Genetic Algorithms Generally Involve Many Choices
        -- size of population?
        -- mutation rate?
        -- how to select pairs for crossover?
        -- crossover point?
        -- chromosome duplication allowed?
        -- fitness calculation method?
        -- initial population?
        -- when to stop?

      • It Is Easy to Climb Bump Mountain Without Crossover
        -- the book's "mutation only" does hill climbing

      • Crossover Enables Genetic Algorithms to Search High-Dimensional Spaces Efficiently
        -- bias selection of pairs for crossover by fitness
        -- tends to combine good traits

      • Crossover Enables Genetic Algorithms to Traverse Obstructing Moats
        -- Fig.25.5
        -- crossover can jump across quality surface
        -- does still tend to get stuck around local maxima

      • The Rank Method Links Fitness to Quality Rank
        -- using rank decouples fitness from the actual quality values
        -- sort individuals by quality
        -- rank sorted individuals (1st, 2nd, ...)
        -- assign 1st a rank fitness of p (e.g., p = 2/3)
        -- remainder is 1 - p = 1/3
        -- assign 2nd a rank fitness of p × remainder = (2/3)(1/3) = 2/9
        -- remainder is now 1/3 - 2/9 = 1/9
        -- assign 3rd a rank fitness of p × remainder, and so on
        -- the last individual gets whatever remainder is left (so the fitnesses sum to 1)
        -- Fig.25.6
        -- Rank method gives nonzero fitness to all
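        -- A minimal rank-fitness calculation (p = 2/3 as above; the quality values are made up):

            def rank_fitness(qualities, p=2/3):
                """Assign fitness by quality rank: p of the remainder to each, last takes what is left."""
                order = sorted(range(len(qualities)), key=lambda i: qualities[i], reverse=True)
                fitness = [0.0] * len(qualities)
                remainder = 1.0
                for rank, i in enumerate(order):
                    if rank == len(order) - 1:
                        fitness[i] = remainder           # the last individual takes the rest
                    else:
                        fitness[i] = p * remainder
                        remainder -= fitness[i]
                return fitness

            print([round(f, 3) for f in rank_fitness([0.9, 0.2, 0.5])])
            # -> [0.667, 0.111, 0.222]: best gets 2/3, middle 2/9, worst the last 1/9 -- all stay selectable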


    • Survival of the Most Diverse
      • The Rank-Space Method Links Fitness to Both Quality Rank and Diversity Rank
        -- keep newly selected chromosomes different from those already selected for population
        -- i.e., reward diversity as well as fitness
        -- use Rank-Space Method to select an individual for new population:
        • sort individuals by quality
        • sort individuals by diversity
          • diversity is sum of inverse squared distances to other already selected candidates (small value is better)
        • sort by sum of quality rank and diversity rank
        • use rank method on result
        • i.e., select best, assign to new population, and repeat.
        • break rank sum ties using diversity
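        -- A minimal sketch of one rank-space selection step (1-D "chromosomes" and made-up qualities, so distance is just the difference between numbers; it takes the best combined rank instead of sampling with the rank method):

            def diversity(candidate, selected):
                """Sum of inverse squared distances to already-selected individuals (smaller = more diverse)."""
                return sum(1.0 / (candidate - s) ** 2 for s in selected)

            def pick_next(candidates, qualities, selected):
                by_quality = sorted(candidates, key=lambda c: qualities[c], reverse=True)
                by_diversity = sorted(candidates, key=lambda c: diversity(c, selected))
                rank_sum = lambda c: by_quality.index(c) + by_diversity.index(c)
                # best combined rank wins; ties break in favor of the more diverse candidate
                return min(candidates, key=lambda c: (rank_sum(c), by_diversity.index(c)))

            candidates = [1.0, 2.0, 8.0, 9.0]                     # 1-D "chromosomes" for simplicity
            qualities = {1.0: 0.6, 2.0: 0.9, 8.0: 0.5, 9.0: 0.4}
            selected = [2.0]                                      # best-quality individual goes in first
            remaining = [c for c in candidates if c not in selected]
            print(pick_next(remaining, qualities, selected))
            # -> 9.0: the rank sums tie, and the tie breaks toward the candidate farthest from 2.0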

      • The Rank-Space Method Does Well on Moat Mountain
        -- Standard method = 155 generations
        -- Quality rank = 75 generations
        -- Rank-Space = 15 generations

      • Local Maxima Are Easier to Handle when Diversity Is Maintained
        -- if some individuals are at local maxima, then diversity pushes new individuals away from those maxima.

      • What's in a population?
        -- John Koza, Genetic Programming: population of programs
        -- David B. Fogel, "Blondie24": population of checkers playing neural networks
          (Excellent paperback book)


    26 Recognizing Objects

    • Linear Image Combinations
      -- recognition by template construction and matching

      • Conventional Wisdom Has Focused on Multilevel Description
        -- conventional wisdom is as follows:
        • process brightness changes to form "primal sketch"
        • find suggested surfaces to get 2.5D sketch
        • these are "viewer centered"
        • find volumes suggested by 2.5D sketch to get "volume description"
        • that is "object centered"
        • match for recognition at the volume description level
        -- but can match at primal sketch level
        -- with the right templates

      • Images Contain Implicit Shape Information
        -- a few views of polyhedral object combine to give info about vertex positions
        -- e.g., plan and elevations

      • One Approach Is Matching Against Templates
        -- "identification model": three images
        -- each image has "feature points"
        -- given an "unknown" image to recognize/classify
        -- using stored images as "templates"
        -- Fig.26.1
        -- simple match is OK if views (rotations) are constrained
        -- but not in general

      • For One Special Case, Two Images Are Sufficient to Generate a Third
        -- Fig.26.2
        -- special case: orthographic projection (along z axis)
        -- rotate around y axis
        -- Fig.26.4
        -- need to match points on unknown and model image
        -- as y values don't change and z values aren't relevant, use the equation for x only
        -- in general:   xIu = AxI1 + BxI2   (Eqn 1)
        -- i.e., matching points in unknown and two templates are related
        -- need to find A and B.

      • Identification Is a Matter of Finding Consistent Coefficients
        -- key idea:
        • Find two points in the unknown that correspond with two points in both template image 1 and template image 2 (I1 & I2).
        • Make two versions of Eqn 1 above, and solve for A and B (alpha and beta).
        • Use all other points in the two templates, and the equation using A and B, to predict all other points in the unknown.
        • If predicted points match actual points on unknown then it is the same type as the templates (e.g., Obelisk).
        -- Tables
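        -- A minimal "consistent coefficients" sketch for Eqn 1 above (all x coordinates are made up; two matched points give A and B, the rest are predicted and checked):

            def solve_ab(x1a, x2a, xua, x1b, x2b, xub):
                """Solve A*x1 + B*x2 = xu for two matched point pairs (Cramer's rule)."""
                det = x1a * x2b - x1b * x2a
                A = (xua * x2b - xub * x2a) / det
                B = (x1a * xub - x1b * xua) / det
                return A, B

            template1 = [1.0, 2.0, 4.0, 7.0]       # x coords of feature points in template image I1
            template2 = [1.5, 2.5, 4.5, 7.5]       # the same feature points in template image I2
            unknown   = [1.25, 2.25, 4.25, 7.25]   # the unknown image to identify

            A, B = solve_ab(template1[0], template2[0], unknown[0],
                            template1[1], template2[1], unknown[1])
            predicted = [A * x1 + B * x2 for x1, x2 in zip(template1, template2)]
            same = all(abs(p - u) < 1e-6 for p, u in zip(predicted, unknown))
            print(A, B, "same object" if same else "different object")   # -> 0.5 0.5 same object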

      • The Template Approach Handles Arbitrary Rotation and Translation
        -- special case: rotation about y axis: 2 points and 2 templates
        -- for arbitrary rotation and translation
        -- use model with 3 image templates rotated and translated
        -- need to match four points

      • The Template Approach Handles Objects with Parts
        -- yup, it can do that too

      • Establishing Point Correspondence
        • Tracking Enables Model Points to Be Kept in Correspondence
          -- Fig.26.12
          -- small object movements allow small image changes
          -- tracking point correspondence is then easier

        • Only Sets of Points Need to Be Matched
          -- you don't need point-to-point correspondence
          -- only set to set correspondence

        • Heuristics Help You to Match Unknown Points to Model Points
          -- Fig.26.13
          -- use set of points at top or bottom



      27 Describing Images

      • Computing Edge Distance
        -- find edges in images

        • Averaged and Differenced Images Highlight Edges
          -- Fig.27.1
          -- sharp changes in brightness
          -- noise in image = spurious edges
          -- remove noise first, then find changes
          -- Fig.27.2
          -- image array to average-brightness array (smoothing)
          -- then find 1st and 2nd derivatives
          -- average-brightness array to average-first-difference array
          -- average-first-difference array to average-second-difference array
          -- high rate of change indicates edge
          -- i.e., a zero crossing
          -- Fig.27.3
          -- combine smoothing and the two differentiations
          -- into a single operator (a point-spread function P)
          -- convolve I with P to give O
          -- for 2D
          • Fig.24.1 (p.493)
          • smoothing: convolve with bell-shaped Gaussian function
          • width of Gaussian affects detail found (narrow, more small edges)
          • combined smoothing+differencing = Mexican Hat shape (sombrero)
          • sombreros can be wide and narrow too
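          -- A 1-D sketch of smooth-then-difference edge finding (the brightness values and threshold are made up, and box smoothing stands in for the Gaussian/sombrero):

              def smooth(signal, width=3):
                  """Average-brightness array: box smoothing (a stand-in for the Gaussian)."""
                  half = width // 2
                  return [sum(signal[max(0, i - half): i + half + 1]) /
                          len(signal[max(0, i - half): i + half + 1]) for i in range(len(signal))]

              def difference(signal):
                  return [b - a for a, b in zip(signal, signal[1:])]

              brightness = [10, 10, 11, 10, 10, 20, 45, 50, 49, 50]   # brightness step starting near index 5
              first = difference(smooth(brightness))                   # average-first-difference array
              second = difference(first)                               # average-second-difference array
              edges = [i + 1 for i, (a, b) in enumerate(zip(second, second[1:]))
                       if a * b < 0 and abs(first[i + 1]) > 5]         # zero crossing with a steep slope
              print(edges)                                             # -> [5]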

        • Multiple-Scale Stereo Enables Distance Determination
          -- Stereo Vision: two images with detected edges
          -- Fig.27.4
          -- need to find distance from cameras to objects (i.e. to edges)
          -- distance to a point is inversely proportional to the shift in the point's position between the 2 images (the disparity)
          -- problem: to measure disparity need to find "corresponding features" in the two images
          -- Fig.27.5
          -- to find correspondence:
          • for each horizontal slice through image
          • find nearest neighbors for each zero-crossing fragment in L image
          • find nearest neighbors for each zero-crossing fragment in R image
          • find pairs that are the closest neighbors of each other
          • match found if distance less than threshold tolerance
          -- Fig.27.7
          -- wide sombrero produces fewer lines, hence less ambiguous matching
          -- narrow sombrero gives more precision, but more ambiguity
          -- more precision in disparity gives better distance estimates
          -- use width w/2 sombrero for accuracy
          -- Fig.27.8
          -- confirm matching (disambiguate) using w sombrero results
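          -- A minimal disparity-to-distance sketch using the standard pinhole-stereo relation distance = focal length × baseline / disparity (the camera numbers and matches are made up):

              FOCAL = 0.05       # focal length in meters (made up)
              BASELINE = 0.20    # distance between the two cameras in meters (made up)

              def depth(x_left, x_right):
                  """Distance to a matched point: focal length times baseline over the disparity."""
                  disparity = x_left - x_right              # shift of the point between the two images
                  return FOCAL * BASELINE / disparity

              matches = [(0.0105, 0.0095), (0.0210, 0.0190)]   # matched x positions (L image, R image)
              for xl, xr in matches:
                  print(round(depth(xl, xr), 1), "m")          # -> 10.0 m and 5.0 m: bigger disparity, nearer edge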

      • Computing Surface Direction
        -- shape --> surface --> shading
        -- how to get shape from shading?
        -- first study shading from surface

        • Reflectance Maps Embody Illumination Constraints
          -- Fig.27.9
          -- light to surface: incident angle
          -- surface to eye: emergent angle
          -- Lambertian surface: brightness depends only on direction of light source
          -- E = rho * cos i (rho is the surface "albedo")
          -- "matte" surface (nonspecular)
          -- Fig.27.10
          -- illuminate Lambertian sphere: isobrightness lines
          -- project lines from sphere to plane
          -- Fig.27.11
          -- make reflectance maps (of cos i values)
          -- new map for each incident+emergent angle
          -- FG plane (map) is parallel to the image plane
          -- (f,g) point on map represents a surface orientation
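          -- A minimal E = rho · cos i calculation (the surface normals and light direction are made up):

              import math

              def brightness(normal, light, rho=1.0):
                  """Lambertian brightness: rho times the cosine of the incident angle."""
                  norm = lambda v: math.sqrt(sum(c * c for c in v))
                  cos_i = sum(n * l for n, l in zip(normal, light)) / (norm(normal) * norm(light))
                  return rho * max(cos_i, 0.0)       # a back-facing surface receives no light

              light = (0.0, 0.0, 1.0)                # light along the viewing (z) axis
              for normal in [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 0.0, 0.0)]:
                  print(normal, round(brightness(normal, light), 2))
              # -> 1.0 facing the light, 0.71 at 45 degrees, 0.0 edge-on (cf. the isobrightness lines)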

        • Making Synthetic Images Requires a Reflectance Map
          -- to generate image
          -- have f and g for every point on image
          -- derived from elevation data
          -- have reflectance map for light position
          -- look up every (f,g) in map to get cos i
          -- use appropriate rho
          -- i.e., (f,g) to brightness

        • Surface Shading Determines Surface Direction
          -- brightness to (f,g)?
          -- brightness at point gives curve on FG map
          -- but surfaces vary smoothly
          -- two constraints
          -- need some known values to start with
          -- set unknown f and g values to 0
          -- (f,g) known at occluding boundary of smooth objects
          -- use relaxation procedure to gradually force constraints to apply across surface


      28 Expressing Language Constraints

      • Natural Language is complex
        -- "I saw the girl in the park with the telescope"
        -- complex structure (nesting is possible)
        -- we can spot incorrect examples (so what?)
        -- ambiguity is common
        • lexical ("bank")
        • syntactic ("the smart girls and boys")
        • pragmatic ("I saw the statue of liberty flying over New York")

      • The Search for an Economical Theory
        -- goal: understanding linguistic constraints
        -- seek simple, concise, comprehensive model

        • Could start with Words
          -- Morphemes: pieces of words with meaning
          -- Mary('s), dog(s), (un)do, (dis)inherit

        • You Cannot Say That
          -- unacceptable sentences act as investigative tool
          -- e.g., * The books is red.
          -- constraint is subject/verb agreement

        • Phrases Crystallize on Words
          -- words have categories
            Determiner (the), Adjective (green), Adverb (slowly), Noun (dog), Verb (sing), Preposition (to, in), Pronoun (she), Quantifier (all), Auxiliary Verb (will, have), Complementizer (that, which)
          -- words form groups called phrases
          -- NP: Noun Phrase: "the green book"
          -- PP: Prepositional Phrase: "to the library"
          -- VP: Verb Phrase: "kissed the frog"
          -- can have phrases that use/contain phrases

        • Structure is Syntax
          -- can use formal grammar (e.g., CFG)
            S --> NP VP
            NP --> Art Adj Noun
            VP --> Verb NP
          -- could parse, could generate
          -- CFG OK for NL?
          -- need some context sensitivity
            Mary sat on her chair
            John sat on his chair
          -- is the visible, surface structure in the sentence all there is?
          • "John is easy to please"
          • "John is eager to please"

        • Many Phrase Types Have the Same Structure
          -- NP: specifier, noun, PP
            (the)(library)(in the city)
          -- PP: specifier, preposition, NP
            (precisely)(at)(noon)
          -- VP: specifier, verb, NP
            (all)(return)(their books)

        • The X-Bar Hypothesis Says that All Phrases Have the Same Structure
          -- Specifier, head, one or more Complements
            ( )(return)(her book)(to the library)(in the morning)
          -- IP: Inflection Phrase: adds tense to verb in VP
            has head of "-ed" (returned), or "will" (will return)
          -- CP: Complementizer Phrase: allows embedded phrases
            He said that (Sarah will return her book)

      • The Search for a Universal Theory
        -- claim: the X-bar representation makes important things explicit and exposes natural constraints

        • A Theory of Language Ought to Be a Theory of All Languages
          -- arrangement of specifiers and complements varies by language

        • A Theory of Language Ought to Account for Rapid Language Acquisition
          -- language is learned very quickly
          -- hypothesis is that universal language constraints are "built-in"
          -- people build sentences from meaning using language knowledge
          -- fine tuning of grammar is learned
            "Look at the sheeps"

        • A Noun Phrase's Case Is Determined by Its Governor
          -- Case assignment: how word fits into sentence
          -- Nominative (I, they), Accusative (me, them), Genitive (my, their)
          -- move up X-bar tree to find governing head to determine pronoun's case

        • Subjacency Limits Wh- Movement
          -- X-bar theory shows constraints on forming questions

      • Competence versus Performance
        -- Competence: knowledge of a language and its rules/constraints.
          (idealized capacity to recognize a theoretically infinite number of sentences)
        -- Performance: external view of language competence (actual utterances), limited by memory, social context, etc.
          "When the, uh, ..., I can't think, ..., of it, eeeeer, the operator is placed, for either vertical or horizontal detection. You put it over the image. Hmmmm, here and here. That results."

    • Analysis by Reversing Generation Can Be Silly
      -- language generation involves many context-specific and hearer-specific adjustments
      -- grammars normally express competence

    • Construction of a Language Understanding Program Remains a Tough Row to Hoe
      -- language is very flexible and hence hard to study
      -- language is complex and grammars are likely to be too
      -- we can understand ungrammatical sentences
      -- idiomatic/metaphorical usage makes hoeing harder

    • NLU is very hard
      • The word "spinglesquidge" never appears in a sentence.
      • I ate the cake with the frosting.
        I ate the cake with the girl.
        I ate the cake with the spoon.
      • The sat cat mat on the.
        Colorless green ideas sleep furiously.
        The box is in the pen.
        I cut the cake with the tractor.
        Fruit flies like a banana.
      • Bob gave Mary a book. He was pleased. She was pleased.
        Bob gave Fred a book. He was pleased.
      • The class hurled abuse at the broken teacher.
      • Class end soon. You go then.

    • Engineers Must Take Shortcuts
      -- aim at specific, limited tasks
      -- task-specific grammar and vocabulary


    29 Responding to Questions and Commands

    -- engineering approach
    -- translate questions/command to database commands
    -- use task and domain-specific language
    -- constraints on use make it easier
    -- other possible approaches: template matching, CFGs,...
    -- need grammar, dictionary, knowledge of target repr
    • Syntactic Transition Nets
      -- transition net
      -- recursive transition net (RTN)
      -- augmented RTN (ATN)
      -- top down, goal driven

      • Syntactic Transition Nets Are Like Roadmaps
        -- sentence net & other subnets
        -- nets have start and terminal nodes
        -- can traverse a link by matching a word, word type, or phrase
        -- phrase needs push to subnet
        -- accept input: at end of S net and all words used

      • A Powerful Computer Counted the Long Screwdrivers on the Big Table
        -- some links will fail
        -- if all links fail, subnet fails
        -- nested-box diagram

      • Full ATN -- add tests to arcs
        -- add actions to arcs
        -- use memory (registers)
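        -- A minimal sketch of traversing such nets top-down (the nets and vocabulary are made up, each net here is a single arc sequence with no alternatives, and the ATN tests/registers are omitted):

            NETS = {
                "S":  [("NP", "push"), ("Verb", "word-type"), ("NP", "push")],
                "NP": [("Art", "word-type"), ("Noun", "word-type")],
            }
            CATEGORIES = {"a": "Art", "the": "Art", "computer": "Noun",
                          "screwdrivers": "Noun", "counted": "Verb"}

            def traverse(net, words):
                """Walk the arcs of one net in order; return the remaining words, or None on failure."""
                for label, kind in NETS[net]:
                    if kind == "push":                            # phrase: push into a subnet
                        words = traverse(label, words)
                    elif words and CATEGORIES.get(words[0]) == label:
                        words = words[1:]                         # word type matched, consume the word
                    else:
                        words = None
                    if words is None:
                        return None
                return words

            sentence = "the computer counted the screwdrivers".split()
            print(traverse("S", sentence) == [])    # True: end of the S net reached with all words used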

    • Semantic Transition Trees
      -- ATN usually organized by syntax
      -- STT organized by meaning
      -- i.e., non-terminals are semantic not syntactic
      -- e.g., Action Object "with" Tool
      -- "Cut the paper with the scissors"
      -- could be made to handle non-syntactic input

      • A Relational Database Makes a Good Target
        -- (Class Color Size Weight Location)
        -- answers retrieved from DB

      • Pattern Instantiation Is the Key to Relational-Database Retrieval in English
        -- query schema instantiated using words from sentence
        -- SELECT < object with < values
        -- note use of "> object" and "< object"
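        -- A minimal pattern-instantiation sketch against a toy relational table (the table contents, schema, and count query are made up):

            TABLE = [
                {"Class": "screwdriver", "Color": "red",  "Size": "long",  "Location": "big table"},
                {"Class": "screwdriver", "Color": "blue", "Size": "short", "Location": "big table"},
                {"Class": "hammer",      "Color": "red",  "Size": "long",  "Location": "shelf"},
            ]

            def count_query(object_class, **restrictions):
                """COUNT rows whose Class matches, subject to extra attribute restrictions."""
                rows = [r for r in TABLE if r["Class"] == object_class and
                        all(r[k] == v for k, v in restrictions.items())]
                return len(rows)

            # "Count the long screwdrivers on the big table"
            print(count_query("screwdriver", Size="long", Location="big table"))   # -> 1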

      • Moving from Syntactic Nets to Semantic Trees Simplifies Grammar Construction
        -- 1-1 correspondence between paths and terminal nodes
        -- terminals can be query building points

      • Recursion Replaces Loops
        -- recursion needed in ATNs too
        -- could have Semantic ATN

      • Q&A Translates Questions into Database-Retrieval Commands
        -- could also be used to access "canned" help text