CS 4341 B03 - Project 2

Computer Science Department

CS4341 Introduction to Artificial Intelligence
Project 2 - B 2003

Dharmesh Thakkar and Prof. Carolina Ruiz

PART I (Homework). DUE DATE: Monday, November 17, 10:00 am (beginning of Monday's class).
Part II (project). DUE DATE: Friday, November 21, 9:00 pm.
Homework 2 is worth 2.5% and Project 2 is worth 10% of the course grade.

Homework and Project Goal
Part I. Homework Assignment
- Homework Submission
- Homework Grading
Part II. Project Assignment
Graduate Credit Problem

HOMEWORK AND PROJECT GOAL

This project/homework consists of designing and implementing a program that plays a 2-players, board game. It will exemplify the minimax algorithm, and alpha-beta pruning, and the use of heuristic (evaluation/static) functions to prune the adversarial search. Usage of any other exisiting AI techniques (e.g. machine learning, search strategies) and/or techniques that you develop for this project along with the Minimax algorithm will earn extra credit and potentially improve performance in the tournament.

The goal of this project and homework matches the following Course Objectives:

Learning to apply course material (to improve thinking, problem solving, and decision) during the design, implementation, and analysis of computer programs that reason and/or act intelligently.
Learning fundamental principles about and generalizations of computational problem solving strategies.
Developing creative capacities for the design, implementation, and analysis of computer programs that reason and/or act intelligently.
Learning to analyze and experimentally evaluate designs and implementations of the intelligent computer programs, in particular those developed during the course projects.

PART I. HOMEWORK ASSIGNMENT

Assume that your are playing 2 dimensional tic-tac-toe on a 3x3 board. You are the "X" player, it is your turn to move, and the current board configuration is depicted below:

              |   |
              |   | O
              |   |
           -----------
              |   |
            X | O | O
              |   |
           -----------
              |   |
              |   | X
              |   |

Assume also that the utility function is given by:

                                              /
                                              |  10,  if "X" wins
  utility of a terminal board configuration = |   0,  if it is a draw
                                              | -10,  if "O" wins
                                              \

(30 points) Follow the steps of the minimax algorithm to expand the full adversarial search tree, propagate the utility function values up the tree.
- (5 points) Based on the minimax expansion of the search tree, what is optimal move for your turn?
(40 points) Starting from scratch, follow the steps of the minimax algorithm with Alpha-Beta pruning to expand the adversarial search tree as necessary, propagating the utility function values up the tree.
- (2 points) Based on the minimax plus alpha-beta pruning expansion of the search tree, what is optimal move for your turn?
- (3 points) Could have this move been different to the move you obtained on part 1. based on the full minimax without alpha-beta pruning expansion of the search tree? Explain your answer.

(30 points) Starting from scratch, follow the steps of the minimax algorithm with Alpha-Beta pruning to expand the adversarial search tree ONLY 2 plies (that is, 1 ply for your move and 1 ply for your opponent's response). Use the following heuristic evaluation/static function to evaluate the (partial) board configurations you obtain:

                                                   /
                                                   |  10,  if B is terminal and "X" wins
  evaluation function of a board configuration B = |   0,  if B is terminal and it is a draw
                                                   | -10,  if B is terminal and "O" wins
                                                   |g(B),  if B is not terminal 
                                                   \


  where g of the board configuration B is given by:


   g(B) = 2*(number of lines containing 2 "X"'s and zero "O"'s )
          + (number of lines containing 1 "X" and zero "O"'s )
          - 2*(number of lines containing 2 "O"'s and zero "X"'s )
          - (number of lines containing 1 "O" and zero "X"'s )

  where "line" is any column, row, or diagonal on the board.

(2 points) Based on this heuristic expansion of the search tree, what is best move for your turn?
(3 points) Could this move be different to the move you obtained on part 1. based on the full minimax expansion of the search tree? Explain your answer.

HOMEWORK SUBMISSION

The homework is due at 10 am on Monday November 17, 2003. Please hand in one HARDcopy of your group solutions to the homework assignment by the beginning of the class on Monday, Nov. 17.

HOMEWORK GRADING

Already included in the homework description.

PART II. PROJECT ASSIGNMENT

This project/homework consists of designing and implementing a program that plays 3 Dimensional Tic-Tac-toe. It will exemplify the minimax algorithm, and alpha-beta pruning, and the use of heuristic (evaluation/static) functions to prune the adversarial search. Usage of any other exisiting AI techniques (e.g. machine learning, search strategies) and/or techniques that you develop for this project along with the Minimax algorithm will earn extra credit and potentially improve performance in the tournament.

GAME DESCRIPTION

Tic-Tac-Toe is a two player game (one of them being your computer program). We would deviate from the classical game which is in 2 dimensions and instead implement a 3-dimensional version of the game. We would have a 3D board which is a 4 x 4 x 4 grid. At the start of the game all cells in the grid are empty. Players take turns placing their pieces on the board until one of the player wins. The player who first manages to place 4 of his symbols/pieces in a straight line on the 3D-board is the winner.

Symbols

Player 1 uses Red pieces and Player 2 uses Blue pieces.

Board

We will use an 4x4x4 board. We will label the the board as

Moves

To initiate the game the user needs to enter the names of the appropriate executables in the appropriate space provided by the Refereee.
The program specifed in the Player 1 textbox would be allocated the Red pieces.
The moves need to be in the format (Symbol, x, y, z) where Symbol is either R(for Red Pieces) or B(for Blue Pieces) and 0<=x<4,0<=y<4 and 0<=z<4
Any move which places the user's piece on any empty cell on the 3D-board is a legal move.
Moves alternate between Red pieces and Blue pieces.
Each player gets a minute to play its move. If a player fails to do so the other player automatically wins.

Game Rules

A game will consist of a sequence of the following actions:

The program specified as player 1 will use the Red pieces and gets to play first.and player 2 uses the Blue pieces.

After that, the players take turns moving. A move (S,x,y,z) must satisfy the following constraints:

Time Limit There is a 60 second time limit for a player to make his move. (user time, not CPU time.)

Move preconditions

The symbol x,y,z is a valid board location i.e. 0<=x<4,0<=y<4 and 0<=z<4

The symbol S should be the symbol allocated to the player i.e. either 'R' or 'B'.

The cell x,y,z is not already occupied.

Move postconditions

After the move, the place at x,y,z on the board will be occupied by the symbol S.

It will be the other player's turn.

The game ends in any of the following cases:

A player after making a move has 4 of his symbols (either R or B) in a straight line and thus wins.

The board is full and no player has 4 of his symbols in a straight line. Then the game ends in a draw.

A player makes an illegal move - the other player wins.

A player fails to move within the time limit - the other player wins.

PROJECT ASSIGNMENT

The project assignment consists of the following:

Your group must implement a program that plays 3D Tic-Tac-Toe.
Each group needs to pick a one-word (less than 10 characters) name for its program.
Your program must use the minimax algorithm with alpha-beta pruning.
- Use a utility function (i.e. static evaluation) of your choice.
If your minimax with alpha-beta pruning cannot expand the whole search tree in the time allowed, your program should use a game strategy:
- Use a heuristic/evalution function of your choice (remember that the evaluation function and the utility function must coincide on final board configurations). You are encouraged to use machine learning techniques here to obtain a good heuristic function.
- Use a heuristic of your choice to avoid expanding the whole minimax tree. You can choose from those covered in class (progressive deepening, heuristic continuation (singular-extension heuristic), and tapered search) or another one of your preference. We highly recommend using the progressive deepening heuristic.
The better your evaluation function and your heuristic, the better your program will play, and the better your grade in the project.
Your program must read input from standard input and write output to standard output and each message should be terminated with a '\n'. Be careful here as the behavior might depend on the high level language you are using.
Time saving tips:
For C you need to append a '\n' to the printf.
For Java use System.out.print(Message + '\n'). System.out.println does NOT work.
Your programs must write messages of length 10 or less and must adhere to the protocol stated below within this section.
Your program starts off by reading from the standard input the symbol ('B' or 'R') allocated to it by the referee. That indicates to your program whether it is player 1 or player 2.
Your programs must produce only valid moves. To make a move your program must ouput the move to the standard output as follows:
R,0,1,1\n or B,0,3,3\n
The following are sample invalid moves:
- r,0,1,1\n
  Because the symbol should either be character 'B' or 'R'.
- R,0,4,3\n
  Because though the board is 4 X 4 X 4 in size the locations are accessed by using subscipts from 0 to 3.
- R,0,1,1
  Because the terminating \n is missing. Be careful how you handle this scenario. You might have to experiment a bit with your code before you can get things to work with the referee code that would be provided. This is because different high languages support different kind of input output handling.
  e.g. In C to flush data to the standard output we can terminate the printf with a newline character '\n'.
  But for the same code if we have Java we need to use System.out.print and not System.out.println as the final newline is sent as a separate message to the referee.
  What might be a good idea is to pad your message to the MAX Message size of 10 as follows:
  R,1,1,1\0\0\n. This removes any chances of error as the referee would read in messages from player programs in blocks of size 10.
Your program must verify that the opponent's move is valid, and then use it to calculate your next move. If the opponent's move is invalid, your program must notify the referee that the last move received was invalid. In order to do so your program must write to standard output as follows:
ERROR\n
On receiving the opponent's move your program must detect if the game has come to an end. If so it must notify the referee of the same by printing to standard output the following message:
GAMEOVER\n
Your program must also check if the move it's about to make would win the game. In that case your program must output the move to the standard output and in addition also notify the referee that it knows that with that move the game has ended. Sample:
R,3,3,3\0\0\n
GAMEOVER\0\n
After reporting an "ERROR\n" or a "GAMEOVER\n" your program MUST wait for an "exit\n" before terminating. Also at any time through the program execution if your program receives "exit\n" while it is expecting the opponent's move, it still MUST terminate your process.
Your program must follow the specification given above so that it can interface with other groups's programs.

PROGRAM COMMUNICATION IN THE TOURNAMENT

The intent of the design of the communication between your program and the referee is that this method will result in the least elaborated code, and that it allows you to implement your program in your choice of languages. Also the debugging is simpler as you read moves from the standard input and write output to standard output and thus you could run your code as a standalone program.
Make sure any debugging messages you have you write them to the Standard Error stream as the messages you write to the standard output are strictly for communicating with the referee and must follow the specified format.

Each group needs to pick a one-word name for its program. We'll refer to that name as groupname in what follows.
Each group must submit the code of its program and an executable (or script) named groupname. The group is also responsible for providing clear directions specifying how the executable could be built by using the code. A makefile/script to do so would be highly appreciated.
To have your creation considered for a grade, it must successfully compile and run on a CCC UNIX machine.
A referee program, described below, will take as input the two executable program names representing the playing groups .The referee will then instantiate/execute) the programs.
The program groupname will read input from standard input and output its moves to standard output.
At the beginning of the game, the groupname would read from the standard input its symbol. If it reads a 'R' it gets to make the first move. If it reads a 'B' it would wait for the opponent to make the first move.

A sample of the communication which might take place between the Referee and your programs is depicted below. The case depicted here is one in which the programs for 2 groups exchange moves and the Group1 wins after n moves.

Group1 writes the first move (say R,1,1,1) to the standard output.
The referee would validate the move, update the UI and communicate the move to the standard input of Group2 .
The Group2 should then reads this move and make it's own move (say B,1,3,1) without exceeding the time limit.
The programs and the referee keep exchanging moves till group 1 sends the n^th move (R,3,3,3).
The Group1 realising it has won also sends the GAMEOVER message to the referee.
The referee sends the move to Group2 to check if Group2 is able to figure out that the opponent has won.
Group2 realising that Group1 has won sends out the GAMEOVER message to the referee.
The referee after receiving GAMEOVER messages from both players, sends "exit" message to both the players indicating that the player processes should terminate.

* Note that a program will know that it is its turn by reading moves from the standard input.
**Each program would get 1 minute(60 seconds), to respond with a move, after the opponent's move has been made available to the standard input

REFEREE

A referee program is provided. This referee program is in charge of deciding which program goes first, giving each program's turn to the other program, displaying the board configuration, detecting illegal moves, and enforcing time limits.

Here are some programs for download:

The referee files.(In zip and .tar.gz formats).
A script human, with required Java code is also provided, to facilitate students to play against their own program. Changing the name of one of the programs to "human" in the Referee's interface helps a user to play interactively with a program or another human.

Specifically, the referee program will be in charge of:

Displaying the board configuration after each move.
Giving turns to each of the playing programs. By reading moves from one player program and making it available on the standard input of the other.
Timing each of the players. It starts measuring the time for groupname's move just after placing the move on the standard input of the groupname program. If a program exceeds its time limit, the referee program will stop the game, announcing that the offending program has been timed out. (Note that a program should not time its opponent.)
Detecting invalid moves. If the referee program finds an invalid move, it will make a note of the error. It would expect the opponent to notice the same. After the opponent's response the referee would stop the game, announcing that the offending program has lost and whether or not the winner recognized the invalid move.
Ending the game when a player has won or all cells on the grid are occupied announcing that the game ended in a draw.

TOURNAMENT

ALL groups who want to participate in the tournament must submit the tournament version of their code via turnin with project name 'tournament' (even if you already submitted working code). The deadline for making the submissions is Sunday, Dec 14 2003 by 12:00(Noon).
Your program must not consume computing resources during your opponent's turn. In this spirit, when it is your opponent's turn, your program should simply wait for your opponent's move. Your process must block.
Both players will be executed on the same machine, and may not spawn processes on any other machine.
Time and Date: 5 p.m. on Sunday Dec 14,2003
Place: CCC (II Floor Fuller Labs).
Points: Your group can earn bonus points.
- Successfully completing at least one game will give you 5 bonus points
- For each game after that: (win = 2pts, draw = 1pt, loss = 0)
- These bonus points will be added to your project #2 score.
- If the bonus point scheme changes, it will only be in the positive direction.
Tournament structure
- The tournament structure will be determined when we know how many group will participate, but every group is guaranteed at least 3 games.
- The more games your program wins, the more games your program will play.
- Unless one of the finalist teams object, championship game will hopefully be held in class.
Groups
- Please email us at cs4341-staff AT cs.wpi.edu with any additions, deletions, or name changes of group members or groups from previous projects.

REPORT SUBMISSION AND DUE DATE

You must submit code documentation along with the code. It must follow the Departmental Documentation Standard (see http://www.cs.wpi.edu/Help/documentation-standard.html). Project 2 is due on November 21, 2003 at 9 pm. One (and only one) member of your group must submit the following files using the turnin program. The name of the turnin directory is proj2.

Documentation

the names of the members of your group

FOR EACH GROUP MEMBER, describe in detail the precise contribution of the group member to the overall project under submitted.

instructions on compiling and running your program

the static evaluation your program uses,

the heuristics employed,

strategy used for minimax and alpha beta pruning,

results:

describe which tests you ran to try out your program. Did you program play against human players? Did you program play against other programs? How did your program do during those games?

describe the strengths and the weaknesses of your program.

a discussion of why the evaluation function and the heuristic are good choices.

Finding a good static evaluation function and a good heuristic depends heavily on the experience you have with the game. I recommend you start playing the game and getting the flavor of what a good Tic-Tac-Toe strategy is. Also, research the strategies that other people have used for this and other games. Lots of information can be found in the references listed in your syllabus, the web, and magazine articles.

Program.
1. The source code for your program
2. See more information about the turnin command on http://cs.wpi.edu/Help/turnin.html

Any auxillary files that your program requires, such as a script to run a language interpreter.

OTHER RESOURCES

Although you are welcome to look at code and systems available online to guide the design of your program, you MUST submit your own original code.

3D Tic-Tac-Toe. http://www.pcworld.com/downloads/file_description/0,fid,6324,00.asp
High Dimensional Tic-Tac-Toe. http://www.kri.ch/aigs/supertic.htm

GRADING CRITERIA:

 35 points minimax implementation 
 15 points alpha-beta pruning
 15 points heuristic evaluation function, i.e. function
           that assigns a number to any intermediate board
           configuration.
 15 points heuristic strategy to avoid expanding the whole
           minimax tree so that a move is produced within
           the time limit.
 10 points playing "reasonably well" - showing evidence
           of good offensive and defensive behavior.
 10 points interacting well with referee + good documentation
----------
100 points

+ Extra points for using machine learning (decision trees,
   neural networks, genetic algorithms, etc.) to make your
   system better at playing the game

+ Extra points for good performance in the tournament

GRADUATE CREDIT PROBLEM:

No graduate part required this time. Just do your best job in getting the best playing system you can.

CS4341 Introduction to Artificial Intelligence Project 2 - B 2003

Dharmesh Thakkar and Prof. Carolina Ruiz

Symbols

Board

Moves

Game Rules

CS4341 Introduction to Artificial Intelligence
Project 2 - B 2003