CS4341 Introduction to Artificial Intelligence
Project 2 - B 2003
PART I (Homework). DUE DATE: Monday, November 17, 10:00 am (beginning
of Monday's class).
Part II (project). DUE DATE: Friday, November 21, 9:00 pm.
Homework 2 is worth 2.5% and Project 2 is worth 10% of the course grade.
This project/homework consists of designing and implementing a program
that plays a 2-players, board game. It will exemplify the minimax algorithm,
and alpha-beta pruning, and the use of heuristic (evaluation/static) functions
to prune the adversarial search. Usage of any other exisiting AI techniques
(e.g. machine learning, search strategies) and/or techniques that you develop
for this project along with the Minimax algorithm will earn extra credit
and potentially improve performance in the tournament.
The goal of this project and homework matches the following Course
Objectives:
- Learning to apply course material (to improve thinking, problem
solving, and decision) during the design, implementation, and analysis
of computer programs that reason and/or act intelligently.
- Learning fundamental principles about and generalizations of
computational problem solving strategies.
- Developing creative capacities for the design, implementation,
and analysis of computer programs that reason and/or act intelligently.
- Learning to analyze and experimentally evaluate
designs and implementations of the intelligent computer programs,
in particular those developed during the course projects.
PART I. HOMEWORK ASSIGNMENT
Assume that your are playing 2 dimensional tic-tac-toe on a 3x3 board. You
are the "X" player, it is your turn to move, and the current board configuration
is depicted below:
| |
| | O
| |
-----------
| |
X | O | O
| |
-----------
| |
| | X
| |
Assume also that the utility function is given by:
/
| 10, if "X" wins
utility of a terminal board configuration = | 0, if it is a draw
| -10, if "O" wins
\
- (30 points) Follow the steps of the minimax algorithm
to expand the full adversarial search tree, propagate the utility
function values up the tree.
- (5 points) Based on the minimax expansion of
the search tree, what is optimal move for your turn?
- (40 points) Starting from scratch, follow the steps
of the minimax algorithm with Alpha-Beta pruning to expand the adversarial
search tree as necessary, propagating the utility function values
up the tree.
- (2 points) Based on the minimax plus
alpha-beta pruning expansion of the search tree, what is optimal
move for your turn?
- (3 points) Could have this move been different
to the move you obtained on part 1. based on the full minimax
without alpha-beta pruning expansion of the search tree? Explain
your answer.
- (30 points) Starting from scratch, follow the steps
of the minimax algorithm with Alpha-Beta pruning to expand the adversarial
search tree ONLY 2 plies (that is, 1 ply for your move and 1 ply
for your opponent's response). Use the following heuristic evaluation/static
function to evaluate the (partial) board configurations you obtain:
/
| 10, if B is terminal and "X" wins
evaluation function of a board configuration B = | 0, if B is terminal and it is a draw
| -10, if B is terminal and "O" wins
|g(B), if B is not terminal
\
where g of the board configuration B is given by:
g(B) = 2*(number of lines containing 2 "X"'s and zero "O"'s )
+ (number of lines containing 1 "X" and zero "O"'s )
- 2*(number of lines containing 2 "O"'s and zero "X"'s )
- (number of lines containing 1 "O" and zero "X"'s )
where "line" is any column, row, or diagonal on the board.
- (2 points) Based on this heuristic expansion
of the search tree, what is best move for your turn?
- (3 points) Could this move be different to
the move you obtained on part 1. based on the full minimax
expansion of the search tree? Explain your answer.
HOMEWORK SUBMISSION
The homework is due at 10 am on Monday November 17, 2003. Please
hand in one HARDcopy of your group solutions to the homework assignment by
the beginning of the class on Monday, Nov. 17.
HOMEWORK GRADING
Already included in the homework description.
This project/homework consists of designing and implementing a program
that plays 3 Dimensional Tic-Tac-toe. It will exemplify the minimax algorithm,
and alpha-beta pruning, and the use of heuristic (evaluation/static) functions
to prune the adversarial search. Usage of any other exisiting AI techniques
(e.g. machine learning, search strategies) and/or techniques that you develop
for this project along with the Minimax algorithm will earn extra credit
and potentially improve performance in the tournament.
Tic-Tac-Toe is a two player game (one of them being your computer program).
We would deviate from the classical game which is in 2 dimensions and
instead implement a 3-dimensional version of the game. We would have a
3D board which is a 4 x 4 x 4 grid. At the start of the game all cells
in the grid are empty. Players take turns placing their pieces on the board
until one of the player wins. The player who first manages to place 4 of
his symbols/pieces in a straight line on the 3D-board is the winner.
Player 1 uses Red pieces and Player 2 uses Blue pieces.
We will use an 4x4x4 board. We will label the the board as
-
Moves
To initiate the game the user needs to enter the names of the appropriate
executables in the appropriate space provided by the Refereee.
The program specifed in the Player 1 textbox would be allocated the Red
pieces.
The moves need to be in the format (Symbol, x, y, z) where Symbol is
either R(for Red Pieces) or B(for Blue Pieces) and 0<=x<4,0<=y<4
and 0<=z<4
Any move which places the user's piece on any empty cell on the
3D-board is a legal move.
Moves alternate between Red pieces and Blue pieces.
Each player gets a minute to play its move. If a player fails to
do so the other player automatically wins.
A game will consist of a sequence of the following actions:
- The program specified as player 1 will use the
Red pieces and gets to play first.and player 2 uses the
Blue pieces.
- After that, the players take turns moving.
A move (S,x,y,z) must satisfy the following constraints:
- Time Limit There is a 60 second time limit for
a player to make his move. (user time, not CPU time.)
- Move preconditions
- The symbol x,y,z is
a valid board location i.e. 0<=x<4,0<=y<4 and 0<=z<4
- The symbol S should be the symbol allocated
to the player i.e. either 'R' or 'B'.
- The cell x,y,z is not already occupied.
- Move postconditions
- After the move, the place at x,y,z
on the board will be occupied by the symbol
S.
- It will be the other player's turn.
- The game ends in any of the following cases:
- A player after making a move has 4 of his symbols
(either R or B) in a straight line and thus wins.
- The board is full and no player has 4 of his symbols
in a straight line. Then the game ends in a draw.
- A player makes an illegal move - the other player
wins.
- A player fails to move within the time limit - the
other player wins.
The project assignment consists of the following:
- Your group must implement a program that plays 3D Tic-Tac-Toe.
- Each group needs to pick a one-word (less than 10 characters)
name for its program.
- Your program must use the minimax algorithm with alpha-beta
pruning.
- Use a utility function (i.e. static evaluation) of your
choice.
- If your minimax with alpha-beta pruning cannot expand the
whole search tree in the time allowed, your program should use a
game strategy:
- Use a heuristic/evalution function of your choice (remember
that the evaluation function and the utility function must coincide
on final board configurations). You are encouraged to use machine
learning techniques here to obtain a good heuristic function.
- Use a heuristic of your choice to avoid expanding the
whole minimax tree. You can choose from those covered in class
(progressive deepening, heuristic continuation (singular-extension
heuristic), and tapered search) or another one of your preference.
We highly recommend using the progressive deepening heuristic.
The better your evaluation function and your heuristic, the
better your program will play, and the better your grade in the project.
- Your program must read input from standard input and write
output to standard output and each message should be terminated with a '\n'.
Be careful here as the behavior might depend on the high level language you
are using.
Time saving tips:
For C you need to append a '\n' to the printf.
For Java use System.out.print(Message + '\n'). System.out.println
does NOT work.
- Your programs must write messages of length 10 or less and must adhere
to the protocol stated below within this section.
- Your program starts off by reading from the standard input the symbol
('B' or 'R') allocated to it by the referee. That indicates to your program
whether it is player 1 or player 2.
- Your programs must produce only valid moves. To make a move your program
must ouput the move to the standard output as follows:
R,0,1,1\n or B,0,3,3\n
The following are sample invalid moves:
- r,0,1,1\n
Because the symbol should either be character 'B' or 'R'.
- R,0,4,3\n
Because though the board is 4 X 4 X 4 in size the locations are accessed
by using subscipts from 0 to 3.
- R,0,1,1
Because the terminating \n is missing. Be careful how you handle this scenario.
You might have to experiment a bit with your code before you can get things
to work with the referee code that would be provided. This is because different
high languages support different kind of input output handling.
e.g. In C to flush data to the standard output we can terminate the
printf with a newline character '\n'.
But for the same code if we have Java we need to use System.out.print and
not System.out.println as the final newline is sent as a separate message
to the referee.
What might be a good idea is to pad your message to the MAX Message size
of 10 as follows:
R,1,1,1\0\0\n. This removes any chances of error as the
referee would read in messages from player programs in blocks of size 10.
- Your program must verify that the opponent's move is valid, and
then use it to calculate your next move. If the opponent's move is invalid,
your program must notify the referee that the last move received was
invalid. In order to do so your program must write to standard output as follows:
ERROR\n
- On receiving the opponent's move your program must detect if the game
has come to an end. If so it must notify the referee of the same by printing
to standard output the following message:
GAMEOVER\n
- Your program must also check if the move it's about
to make would win the game. In that case your program must output the move
to the standard output and in addition also notify the referee that it knows
that with that move the game has ended. Sample:
R,3,3,3\0\0\n
GAMEOVER\0\n
- After reporting an "ERROR\n" or a "GAMEOVER\n" your program MUST
wait for an "exit\n" before terminating. Also at any time through the program
execution if your program receives "exit\n" while it is expecting the opponent's
move, it still MUST terminate your process.
- Your program must follow the specification given above so that it
can interface with other groups's programs.
The intent of the design of the communication between your program and
the referee is that this method will result in the least elaborated
code, and that it allows you to implement your program in your choice
of languages. Also the debugging is simpler as you read moves from the
standard input and write output to standard output and thus you could run
your code as a standalone program.
Make sure any debugging messages you have you write them to the Standard
Error stream as the messages you write to the standard output are strictly
for communicating with the referee and must follow the specified format.
- Each group needs to pick a one-word name for its program.
We'll refer to that name as groupname in what follows.
- Each group must submit the code of its program and an executable
(or script) named groupname. The group is also responsible
for providing clear directions specifying how the executable could be
built by using the code. A makefile/script to do so would be highly appreciated.
To have your creation considered for a grade,
it must successfully compile and run on a CCC UNIX machine.
- A referee program, described below, will take as input the
two executable program names representing the playing groups .The referee
will then instantiate/execute) the programs.
- The program groupname will read input
from standard input and output its moves to standard output.
- At the beginning of the game, the groupname would
read from the standard input its symbol. If it reads a 'R' it gets to make
the first move. If it reads a 'B' it would wait for the opponent to make
the first move.
A sample of the communication which might take place between the Referee
and your programs is depicted below. The case depicted here is one in which
the programs for 2 groups exchange moves and the Group1 wins after n moves.
- Group1 writes the first move (say R,1,1,1) to
the standard output.
- The referee would validate the move, update the UI and communicate
the move to the standard input of Group2 .
- The Group2 should then reads this move and make it's own move
(say B,1,3,1) without exceeding the time limit.
- The programs and the referee keep
exchanging moves till group 1 sends the nth move (R,3,3,3).
- The Group1 realising it has won also sends the GAMEOVER message
to the referee.
- The referee sends the move to Group2 to check if Group2 is able to
figure out that the opponent has won.
- Group2 realising that Group1 has won sends out the GAMEOVER message
to the referee.
- The referee after receiving GAMEOVER messages from both players, sends
"exit" message to both the players indicating that the player processes should
terminate.
* Note that a program will know that it is its turn by reading moves from
the standard input.
**Each program would get 1 minute(60 seconds), to respond with a move, after
the opponent's move has been made available to the standard input
A referee program is provided. This referee program is in charge of deciding
which program goes first, giving each program's turn to the other program,
displaying the board configuration, detecting illegal moves, and enforcing
time limits.
Here are some programs for download:
- The referee files.(In zip and
.tar.gz
formats).
- A script human, with required Java code is also provided, to facilitate
students to play against their own program. Changing the name of one of the
programs to "human" in the Referee's interface helps a user to play interactively
with a program or another human.
Specifically, the referee program will be in charge of:
- Displaying the board configuration after each
move.
- Giving turns to each of the playing programs. By reading
moves from one player program and making it available on the standard input
of the other.
- Timing each of the players. It starts measuring the time
for groupname's move just after placing the move on the standard
input of the groupname program. If a program exceeds its time
limit, the referee program will stop the game, announcing that the
offending program has been timed out. (Note that a program should not
time its opponent.)
- Detecting invalid moves. If the referee program finds an
invalid move, it will make a note of the error. It would expect the
opponent to notice the same. After the opponent's response the referee would
stop the game, announcing that the offending program has lost and whether
or not the winner recognized the invalid move.
- Ending the game when a player has won or all cells on the
grid are occupied announcing that the game ended in a draw.
- ALL groups who want to participate in the tournament must
submit the tournament version of their code via turnin with project
name 'tournament' (even if you already submitted working code). The
deadline for making the submissions is Sunday, Dec 14 2003 by 12:00(Noon).
- Your program must not consume computing resources during
your opponent's turn. In this spirit, when it is your opponent's turn,
your program should simply wait for your opponent's move. Your process
must block.
- Both players will be executed on the same machine, and may
not spawn processes on any other machine.
- Time and Date: 5 p.m. on Sunday Dec 14,2003
- Place: CCC (II Floor Fuller Labs).
- Points: Your group can earn bonus points.
- Successfully completing at least one game will give you
5 bonus points
- For each game after that: (win = 2pts, draw = 1pt, loss
= 0)
- These bonus points will be added to your project #2
score.
- If the bonus point scheme changes, it will only be in
the positive direction.
- Tournament structure
- The tournament structure will be determined when we know
how many group will participate, but every group is guaranteed
at least 3 games.
- The more games your program wins, the more games your
program will play.
- Unless one of the finalist teams object, championship
game will hopefully be held in class.
- Groups
- Please email us at cs4341-staff AT cs.wpi.edu
with any additions, deletions, or name changes of group members or groups
from previous projects.
You must submit code documentation along with the code. It must follow
the Departmental Documentation Standard (see http://www.cs.wpi.edu/Help/documentation-standard.html).
Project 2 is due on November 21, 2003 at 9 pm. One (and only one)
member of your group must submit the following files using the turnin program. The name
of the turnin directory is proj2.
- the names of the members of your group
- FOR EACH GROUP MEMBER, describe in detail the precise
contribution of the group member to the overall project
under submitted.
- instructions on compiling and running your program
- the static evaluation your program uses,
- the heuristics employed,
- strategy used for minimax and alpha beta pruning,
- results:
- describe which tests you ran to try out your program.
Did you program play against human players? Did you program play
against other programs? How did your program do during those games?
- describe the strengths and the weaknesses of your
program.
- a discussion of why the evaluation function and the heuristic
are good choices.
Finding a good static evaluation function and a good heuristic depends
heavily on the experience you have with the game. I recommend you
start playing the game and getting the flavor of what a good Tic-Tac-Toe
strategy is. Also, research the strategies that other people have used
for this and other games. Lots of information can be found in the
references listed in your syllabus, the web, and magazine articles.
- Any auxillary files that your program requires, such as
a script to run a language interpreter.
Although you are welcome to look at code and systems available
online to guide the design of your program, you MUST submit your
own original code.
GRADING CRITERIA:
35 points minimax implementation
15 points alpha-beta pruning
15 points heuristic evaluation function, i.e. function
that assigns a number to any intermediate board
configuration.
15 points heuristic strategy to avoid expanding the whole
minimax tree so that a move is produced within
the time limit.
10 points playing "reasonably well" - showing evidence
of good offensive and defensive behavior.
10 points interacting well with referee + good documentation
----------
100 points
+ Extra points for using machine learning (decision trees,
neural networks, genetic algorithms, etc.) to make your
system better at playing the game
+ Extra points for good performance in the tournament
GRADUATE CREDIT PROBLEM:
- No graduate part required this time. Just do your best
job in getting the best playing system you can.