IMGD 2905 Project 2

Mazetool Analytics

Due date: Monday, April 1st, 11:59pm

[Mazetool]


The goal of this project is to use a complete game analytics pipeline on a simple game with a level of your design. This will illustrate a common process in game development - analyzing player behavior/performance in game at various depths. You will design a level (a maze), get players to play your maze (as well as play others' mazes), analyze player data for your maze and compare player performance for your maze with other mazes. Results will be presented in a report.



Part 0 - Level Design, Play and Analysis Tools

Level Design

For this part, you will design a level (a maze) and play your classmates' levels (mazes).

Read the Mazetool documentation (docx, pdf).

Using Chrome or Firefox (results with other Web browsers can vary) visit:

https://ps3.perlenspiel.net/examples/mazetool/game.html

and type in your WPI username (without the @wpi.edu extension). Press enter.

Play some (3 to 5) randomly generated mazes to get the feel of the game. Design a few (2 to 3) custom mazes and play them to get a better feel for how maze design can impact the experience. For several mazes you complete, take a look at the data that is generated and sent by email to have a better understanding of the information that is collected each game.

When ready, design one maze that provides for a player experience that you think will be interesting. You should have a level-design experience in mind - e.g., a long maze, a tricky maze, both easy and tricky parts in the maze, a maze with backtracking, a direct-line maze, .... Playtest your maze yourself several times to help ensure you get the experience intended. Be sure to have your email (WPI username) entered and click "save". This will allow your maze to be loaded by others.

In preparation for your report:

A. Take a screenshot of your maze, showing the player's starting location, all the gold pieces and the exit (cropping out the Web browser so you have just the maze). You will include this picture in your final report.

In preparation for your analysis:

  1. Determine the shortest path to complete your maze in spaces.

  2. Determine the fewest number of clicks needed to complete your maze.

  3. Determine the shortest time needed to complete your maze by either: a) measuring by running the maze yourself as fast as possible, or b) determining the fewest number of spaces and estimating time based on speed.

  4. Based on #3, decide on how short a time (e.g., 1 minute) is needed to complete your maze to count as a "win".
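
Option (b) in step 3 is simple arithmetic: divide the shortest path length by a movement speed. Below is a minimal sketch, assuming you estimate the speed from one of your own fast playtest runs. The function name estimate_time and every number are made-up placeholders, not measurements.

```python
# Rough estimate of the fastest completion time (step 3, option b).
# All numbers here are hypothetical placeholders; substitute your own.

def estimate_time(shortest_path_spaces, spaces_per_second):
    """Return the estimated seconds needed to travel the shortest path."""
    return shortest_path_spaces / spaces_per_second

# Speed from a playtest run: spaces traveled divided by seconds taken.
playtest_spaces = 120      # hypothetical
playtest_seconds = 30.0    # hypothetical
speed = playtest_spaces / playtest_seconds

best_time = estimate_time(90, speed)   # 90 = hypothetical shortest path
print("Estimated fastest time (s):", best_time)
```

A speed taken from a run with pauses will be too low, so an estimate like this is likely an upper bound on the true fastest time.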

Play

When and where indicated by the professor, play the maze of every other student in the class.

Open your Chrome/Firefox Web browser to:

https://ps3.perlenspiel.net/examples/mazetool/game.html

From the list of usernames provided by the professor during the class session, enter the username of the first person on the list. (Tip! If an error is reported, you likely entered the wrong name. In that case, retry.) Touch the path to start to play, completing the maze as quickly as possible. Do this just once.

When you complete the maze (you reach the exit), refresh your Web browser (hit the refresh button or hit F5). Then, move on to the next username.

Repeat the above for everyone on the list (except yourself, of course).

Analysis Tools

To analyze the mazetool output data, you will use Python and a spreadsheet (e.g., Microsoft Excel).

Follow the instructions in the setup Python document to get Python installed and ready.

When done, examine the parse.py Python script below, which parses the Mazetool comma separated value (csv) output. You are encouraged to log in to your Jupyter account and:

  1. Create a new folder called mazetool-data. Upload the example file sample.csv to this new folder.

  2. Create a new Python Notebook (select "New" --> "Python 3").

  3. Into your Notebook, paste in the below script. Modify the variable DIR to be the location of your sample.csv file (e.g., mazetool-data).

#
# parse.py - parse Mazetool output file(s).
#
# version 2.0
#

# Needed imports.
import csv
import os

DIR="change-to-your-dir-name"  # e.g., mazetool-data

# Repeat (loop) for every file in directory
for f in os.listdir(DIR):

    print("---------------------------------")

    # Only handle .csv files.
    if not f.endswith(".csv"):
        print("Ignoring:", f)
        continue

    # Print file information.
    filename = DIR + "/" + f
    print("File:", f)
    print("Full path:", filename)
    
    # Print out gold events.
    print("Gold (time number):")
    with open(filename, 'r') as csvfile:   # read from file
        reader = csv.DictReader(csvfile)   # treat as csv file
        for row in reader:
            if row['gold'] != '':   # '!=' compares values ('is not' tests identity)
                print(row['time'], row['gold'])
    print("-----------")
                
    # Print out click events.
    print("Clicks (time spaces):")
    with open(filename, 'r') as csvfile:   # read from file
        reader = csv.DictReader(csvfile)   # treat as csv file
        for row in reader:
            if row['click'] != '':  # '!=' compares values ('is not' tests identity)
                print(row['time'], row['click'])
    print("-----------")
                            
    # Print out exit event (it is always the last line in file).
    print("Exit (time spaces):")
    print(row['time'], row['exit'])

  4. Run your Notebook. After the file information lines, it should produce the following output:
    Gold (time number):
    01.586 1
    01.786 2
    16.404 3
    18.806 4
    23.911 5
    32.215 6
    35.217 7
    37.719 8
    38.119 9
    38.518 10
    -----------
    Clicks (time spaces):
    00.800 0
    12.653 10
    17.255 48
    20.787 64
    25.783 96
    28.885 127
    31.150 131
    33.036 142
    36.191 164
    1:10.071 188
    -----------
    Exit (time spaces):
    1:15.022 238

This same script can work on any Mazetool output.

  1. Study the script carefully and modify it and re-run it as needed to gain a deep understanding of how it works.

  2. For your analysis, copy, extend and modify it to provide the data you need.
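
As one possible extension, the sketch below condenses a single run into three values: number of clicks, exit time and spaces traveled at exit. It assumes the column names used by parse.py ('time', 'gold', 'click', 'exit'). The function name run_summary and the sample rows are made up for illustration; real Mazetool files may contain more rows or columns.

```python
import csv
import io

# Made-up sample in the layout parse.py expects: a 'time' column plus
# 'gold', 'click' and 'exit' columns that are empty when not relevant.
SAMPLE = """time,gold,click,exit
00.800,,0,
01.586,1,,
12.653,,10,
1:15.022,,,238
"""

def run_summary(csvfile):
    """Return (clicks, exit_time, exit_spaces) for one Mazetool run."""
    clicks = 0
    exit_time = None
    exit_spaces = None
    for row in csv.DictReader(csvfile):
        if row['click'] != '':      # one row per click event
            clicks += 1
        if row['exit'] != '':       # the exit event row
            exit_time = row['time']
            exit_spaces = int(row['exit'])
    return clicks, exit_time, exit_spaces

print(run_summary(io.StringIO(SAMPLE)))
```

For a real file, pass the object returned by open(filename, 'r') instead of io.StringIO(SAMPLE).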

Write Data to File

One extension to the script you will likely use is to write data you want to analyze (e.g., by Excel in a chart) to a file. This can be done in Python fairly easily. Below is some sample code showing one way to do this. There are others that you can find by searching the Web.

#
# write.py - Show basic csv file writing.
#
# version 1.1
# 

import math  # needed for sqrt()

# Output directory and file name.
DIR="."  # '.' means current directory.  Or try, e.g., mazetool-data
FILE="basic.csv"

# Write some numbers to file with commas (i.e., a csv).
filename = DIR + "/" + FILE
with open(filename, 'w') as csvfile:

  # Print header.
  print ("Loop, Square, Square Root,", file=csvfile)
  # Repeat (loop) for numbers 1 to 10.
  for i in range(1, 11):  # range() stops before its second argument
    
    # Print numbers: integer, integer, float.
    print ("%d, %d, %f," % (i, i*i, math.sqrt(i)), file=csvfile)

# Note, file closes automatically here.

Note, if you cut and paste the above script to your Python Notebook (either a new one or the one you have created) it should work, but you won't see anything printed on the screen for output. Instead, the script will have created a file called "basic.csv" that you can open in your Jupyter account by double-clicking on it, or by selecting, downloading and opening it with Excel.
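
As one of the other ways mentioned, Python's standard csv module can handle the commas itself. The sketch below writes the same kind of table with csv.writer and reads it back to check the result. It writes to a temporary directory only so that it runs anywhere; that choice is an assumption of this example, not part of the project setup.

```python
import csv
import math
import os
import tempfile

# Write to a temporary directory so the sketch runs anywhere;
# in a Notebook you would likely use DIR = "." instead.
DIR = tempfile.mkdtemp()
filename = os.path.join(DIR, "basic.csv")

with open(filename, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["Loop", "Square", "Square Root"])   # header row
    for i in range(1, 11):                               # numbers 1 to 10
        writer.writerow([i, i * i, math.sqrt(i)])

# Read the file back to confirm what was written.
with open(filename, 'r', newline='') as csvfile:
    rows = list(csv.reader(csvfile))
print(rows[0], len(rows))
```

Unlike the print-based version, csv.writer does not add a trailing comma to each line, and it quotes any value that itself contains a comma.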

Time in Seconds

Note that the Mazetool output is in seconds only (e.g., 36.191) when under 1 minute, but when over one minute contains the minutes followed by a colon and the seconds (e.g., 1:10.071). The times can be converted to just seconds in all cases with code similar to:

# Convert Mazetool time format to seconds.
t = row['time']
if ":" in t:  # time went over one minute.
    (m, s) = t.split(':')             # split into minutes and seconds
    seconds = int(m) * 60 + float(s)  # convert to total seconds
else:         # time is less than one minute.
    seconds = float(t)                # convert text to a number
print(seconds)

Note that the above script is not a complete program - you can't just run it by itself (in fact, you will get an error "row not defined" if you try). Instead, you can use that script as part of another script. For example, try using the script at the end of parse.py to print the exit time in seconds.


Part 1 - Maze Running

From the data files for people that played your maze, select two example runs that illustrate progression through your maze. One should be a "short" run where the player completed the maze quickly, likely with fewer mouse clicks, and one should be a "long" run where the player took longer to complete the maze, perhaps with more mouse clicks.

Analyze the data, providing a time-series chart of distance traveled (in spaces) versus time (in seconds). The distance should be obtained from the mouse click events. Chart trendlines (with lines and points) should be clearly indicated.

[distance-vs-time]
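
The data for such a chart can be pulled from the click events of a run. The sketch below is one way to do it, assuming the 'time' and 'click' columns used by parse.py. The helper names to_seconds and click_series and the sample rows are made up for illustration.

```python
import csv
import io

def to_seconds(t):
    """Convert a Mazetool time ('36.191' or '1:10.071') to float seconds."""
    if ":" in t:
        m, s = t.split(":")
        return int(m) * 60 + float(s)
    return float(t)

def click_series(csvfile):
    """Return a list of (seconds, spaces) pairs, one per click event."""
    series = []
    for row in csv.DictReader(csvfile):
        if row['click'] != '':
            series.append((to_seconds(row['time']), int(row['click'])))
    return series

# Made-up sample rows in the layout parse.py expects.
SAMPLE = """time,gold,click,exit
00.800,,0,
36.191,,164,
1:10.071,,188,
1:15.022,,,238
"""

# Print in csv form; add file=... to print() to write a file for Excel.
print("Time, Spaces,")
for seconds, spaces in click_series(io.StringIO(SAMPLE)):
    print("%.3f, %d," % (seconds, spaces))
```

Running this once for the "short" run and once for the "long" run gives the two data sets to chart.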

Additional Trend Line

While two separate charts could be shown, it is better to have one chart with two trendlines (as done in the sample chart). In Excel, this may be done by first creating a scatter plot for the first data set (e.g., the "long" run):

"Insert" --> "Scatter Plot" --> "Scatter Plot with Lines and Markers"

Then, adding the second data set (e.g., the "short" run) to the first graph by: 1) selecting the data and copying (ctrl-c), 2) selecting the chart, and 3) choosing:

"Home" --> "Paste" --> "Paste special"

making sure "Categories (X Values) in First Column" is selected.


Part 2 - Win Tally

From all the data files for people that played your maze, tally up the number of people that "won" (completed your maze below the time you had specified as a winning time).

Make a pie chart of the number of wins and the number of losses.

[win-tally]

To do this analysis, you will need data in the form:

Time,
115.396,
45.022,
46.924,

where each row is the exit time (the time the maze was completed), in seconds.
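
A sketch of a script that produces data in this form is below. It assumes a directory holding only the runs of your maze and the 'time' and 'exit' columns used by parse.py. The helper names and the two demo files are made up for illustration.

```python
import csv
import os
import tempfile

def to_seconds(t):
    """Convert a Mazetool time ('36.191' or '1:10.071') to float seconds."""
    if ":" in t:
        m, s = t.split(":")
        return int(m) * 60 + float(s)
    return float(t)

def exit_times(directory):
    """Return the exit time (seconds) from every .csv file in directory."""
    times = []
    for f in sorted(os.listdir(directory)):
        if not f.endswith(".csv"):
            continue
        with open(os.path.join(directory, f), 'r') as csvfile:
            for row in csv.DictReader(csvfile):
                if row['exit'] != '':
                    times.append(to_seconds(row['time']))
    return times

# Demo on two made-up files in a temporary directory.
demo_dir = tempfile.mkdtemp()
for name, t in [("a.csv", "45.022"), ("b.csv", "1:55.396")]:
    with open(os.path.join(demo_dir, name), 'w') as out:
        out.write("time,gold,click,exit\n%s,,,238\n" % t)

print("Time,")
for t in exit_times(demo_dir):
    print("%.3f," % t)
```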

Once the time csv data is imported into Excel, the wins and losses can be tallied. In a separate column, provide an IF formula that checks if the time is less than the winning time. For example, if the winning time is under 30 seconds, the formula to check the time in cell A1 would be:

=IF(A1 < 30, "yes", "no")

Copy and paste this formula along the column for each row.

Then, use a COUNTIF formula to count up the "yes" and the "no" values. Something like:

=COUNTIF(B1:B3, "yes")

For example, a spreadsheet with the analysis of 3 maze times (15.396, 45.022, and 76.924 seconds), might look like:

[Spreadsheet Pie Sample]
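
The same tally can be cross-checked in Python. This is a minimal sketch using the three example times and the 30 second example winning time; the variable names are made up for illustration.

```python
WIN_TIME = 30.0                    # winning time in seconds (example value)
times = [15.396, 45.022, 76.924]   # the three example maze times

# Same test as the IF formula: a win is a time under the winning time.
wins = sum(1 for t in times if t < WIN_TIME)
losses = len(times) - wins
print("yes:", wins, "no:", losses)
```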

A pie chart can be made by selecting the "yes" and "no" rows and the tally column (e.g., C1:C2 to D1:D2 in the example) and then:

"Insert" --> "Insert Pie" --> "2-D Pie"


Part 3 - Maze Time

Analyze the completion times for all the people that played your maze via a cumulative distribution chart. In addition, analyze and compare the distribution of completion times of your maze to the distribution of completion times of all other mazes (i.e., do not include your maze data in the "all" maze data).

The time data is the same as in part 2.

Time,
115.396,
45.022,
46.924,
...

But for this part, draw a cumulative distribution chart of the times.

[times-distribution]

To create a cumulative distribution chart:

  1. Sort the data from low to high.
  2. Compute the percent in an adjacent column with =row()/count * 100 where count is the number of data rows (this assumes the data starts in row 1, with no header row above it).
  3. Select both columns (data and percent).
  4. Create a chart: "Insert" --> "Scatter Plot" --> "Scatter Plot with Lines and Markers"
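
The first two steps can also be done in Python before the data reaches the spreadsheet. This is a minimal sketch using the three sample times from this part; the function name cumulative is made up for illustration.

```python
times = [115.396, 45.022, 46.924]   # sample times from this part

def cumulative(values):
    """Return (value, percent) pairs for a cumulative distribution."""
    ordered = sorted(values)         # step 1: sort low to high
    count = len(ordered)
    # Step 2: the percent for the k-th smallest value is k / count * 100.
    return [(v, (rank / count) * 100)
            for rank, v in enumerate(ordered, start=1)]

print("Time, Percent,")
for value, percent in cumulative(times):
    print("%.3f, %.1f," % (value, percent))
```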

A trend line for the second distribution (e.g., "all") can be added by the same method used in part 1 to add a trend line to a time series chart.

Compare observed times for your maze with the "best" times you determined in Part 0.


Part 4 - Maze Analysis

Do some comparative analytics on individual mazes, including your own, compared to all mazes.

First, for all mazes and all runs (including your own), find the minimum, maximum, mean, median and mode for:

  1. Number of clicks made
  2. Time to complete
  3. Number of spaces traveled

Report the results in a table.
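
These statistics can be computed in a spreadsheet or in Python. The sketch below uses Python's standard statistics module, with the mode taken from a Counter so that ties do not raise an error. The function name summarize and the click counts are hypothetical.

```python
import statistics
from collections import Counter

def summarize(values):
    """Return min, max, mean, median and mode for a list of numbers."""
    # most_common(1) gives the most frequent value; on a tie it gives
    # the tied value that appears first in the list.
    mode = Counter(values).most_common(1)[0][0]
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": mode,
    }

clicks = [7, 9, 7, 12, 5]   # hypothetical clicks-per-run values
print(summarize(clicks))
```

Calling summarize once each for clicks, times and spaces gives the rows of the table. For continuous values such as time, the mode may not be meaningful unless the times are rounded first.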

For the variables of clicks, time and spaces, draw radar charts comparing your maze average clicks, average time and average spaces to the average clicks, average time and average spaces for all mazes.

In order to draw a radar chart in Excel, data needs to be in the form:

Clicks, Spaces, Time,
     7,    128,   72,

Once in Excel, select both rows and all three columns and:

"Insert" --> "Insert Waterfall, Funnel, Stock, Surface, or Radar Chart" --> "Radar"

(Note, the above command may need to be adjusted depending upon your Excel version.)

In doing so, however, the comparative charts will not be very effective! This is because the number of clicks is typically far smaller than either spaces or time. To remedy this, normalize each average by dividing it by the corresponding maximum (previously computed for the table), i.e., clicks / max_clicks. This will produce a number from 0 to 1 that is comparable across variables. Data will be similar to:

Clicks, Spaces, Time,
0.428,  0.714, 0.98,

A radar plot drawn with the normalized data will be comparable.
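
The normalization is one division per variable. In the sketch below the averages are the raw example values shown earlier, while the maximums are hypothetical stand-ins for the values from your table, so the output will not match the normalized sample above.

```python
averages = {"Clicks": 7, "Spaces": 128, "Time": 72}      # raw example values
maximums = {"Clicks": 20, "Spaces": 200, "Time": 120}    # hypothetical maxima

# Divide each average by the maximum for the same variable.
normalized = {name: averages[name] / maximums[name] for name in averages}

print("Clicks, Spaces, Time,")
print("%.3f, %.3f, %.3f," % (normalized["Clicks"],
                             normalized["Spaces"],
                             normalized["Time"]))
```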

[radar-compare]

Select three other mazes, randomly or based on your own interest (e.g., your friends). Draw similar radar charts for each, being sure to clearly identify which maze each chart came from.

Compare observed average clicks and average spaces for your maze with the "best" values (fewest clicks and shortest path) you determined in Part 0.


Hints

The entire Mazetool data set for all players on all mazes is available at:

https://web.cs.wpi.edu/~imgd2905/d19/projects/proj2/mazetool-data.zip

For your analysis, you should note aggregate characteristics about the data set (e.g., number of people, number of mazes played, etc).

Uploading multiple files (e.g., 300 or more Mazetool csv files) can be tedious. See the Python setup tips for how to unzip (and zip) multiple files in a Notebook.
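
One way to unzip in a Notebook uses Python's standard zipfile module. This is only a sketch and the setup tips may describe a different approach. The function name unzip and the demo archive are made up for illustration.

```python
import os
import tempfile
import zipfile

def unzip(zip_path, out_dir):
    """Extract every file in zip_path into out_dir; return the names."""
    with zipfile.ZipFile(zip_path, 'r') as archive:
        archive.extractall(out_dir)
        return archive.namelist()

# Demo: build a tiny made-up archive, then extract it.
work = tempfile.mkdtemp()
demo_zip = os.path.join(work, "demo.zip")
with zipfile.ZipFile(demo_zip, 'w') as archive:
    archive.writestr("run1.csv", "time,gold,click,exit\n45.022,,,238\n")

names = unzip(demo_zip, os.path.join(work, "mazetool-data"))
print(names)
```

With the real data set uploaded, the call would look something like unzip("mazetool-data.zip", "mazetool-data").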

For many kinds of analytics, including game analytics, organization is key. Paying special attention to filenames - raw data, scripts and csv data - will pay dividends as the project progresses and gets more complicated. This is especially important if you ever have to re-visit your analysis, something that is quite common in practice.

With this in mind, some suggestions on keeping organized:

  1. Make small, individual scripts that provide data for one part of the needed analysis. For example, a script may just pull out all the completion (exit) times from a series of files. Nothing more. This could be used for Part 2.

  2. Have comments in each script that give its name and say something meaningful about what it does. For example, "exit-times - records exit-times".

  3. Have any output csv files produced by the script use the same name as the script. For example, exit-times.csv.

  4. For spreadsheet analysis, have a separate file for each part in the analysis or, alternatively, a separate sheet for each part of the analysis. Name the file (or sheet) with the same name used for the script and the data. For example, exit-times.xlsx.

  5. Have a brief README.txt file for your own notes that provides a one-line description of what each script does.

When embedding charts in a report, fonts may often shrink to the point they are not readable! To avoid this, as a guideline, compare the size of text inside a chart to the size of the text in the paper. They should be similar in size. If the chart text size is too small (or way too big), go back to the original chart and choose a font size that results in a final font size that is more proportional to the paper font. Note, this may require adjusting other aspects of the chart, such as axis tick marks and spacing.

The aspiring Python programmer might want to have an easier way to use the code to compute time in seconds. In general, small pieces of code like this can be separated into a "block" of code, called a function. With a function, you could write something like, for example, seconds = getSeconds(row['time']) to get the number of seconds, regardless of whether the time is prefixed with minutes or not. Tutorials to make functions can be found online; one such document is:

https://www.tutorialspoint.com/python/python_functions.htm
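
A minimal sketch of what such a function might look like is below. It uses the name getSeconds from the example call and assumes times never exceed the minutes:seconds form.

```python
def getSeconds(t):
    """Convert a Mazetool time string to seconds (as a float)."""
    if ":" in t:                          # time went over one minute
        m, s = t.split(":")               # split into minutes and seconds
        return int(m) * 60 + float(s)     # convert to total seconds
    return float(t)                       # time is less than one minute

print(getSeconds("36.191"))
print(getSeconds("1:10.071"))
```

Because the function always returns a float, the result can be used directly in comparisons and arithmetic.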

Many of the grading comments applied to Project 1 are general and pertain to Project 2 as well. You should review the comments made to your Project 1 report and make sure to incorporate needed changes into your Project 2 report.

You should also check out the Postmortem Feedback on Graded Project 1s for general guidelines that also pertain to this Project 2.


Writeup

Write up a short report.

For Part 0 (Level Design, Play and Analysis Tools), include details on your maze, describing the high level experience, showing a screen shot of your maze, and providing data on the "win" condition, shortest path, fewest clicks and fastest time estimates.

For each other part of the project, provide a brief section on the analysis in clearly labeled sections (e.g., Part 1 - Maze Running). Include a brief description of the methodology, particularly as it may relate to the results obtained.

All guidelines for presenting and describing charts should be adhered to.


Submission

The assignment is to be submitted electronically via Canvas by 11:59pm on the day due.

The submission is a report in PDF, named:

    proj2-lastname.pdf

with your name in place of "lastname" above, of course.

To submit your assignment (proj2-lastname.pdf):

Open: IMGD2905-D19-D01
Navigate to: Assignments -> Project 2
Click: Submit Assignment
Click: Choose File
Select the pdf file: proj2-lastname.pdf
Click: Submit Assignment

Important - you must click the Submit Assignment button at the end or your file will not be submitted!

When successfully submitted, you should see a message similar to:

Submission
- Submitted!
Apr 1 at 11:52pm


Grading

All accomplishments are shown through the report. The point break down does not necessarily reflect effort or time on task. Rather, the scale is graduated to provide for increasingly more effort required for the same reward (points).

Breakdown

Part 0 - 10% : Building a maze and playing everyone else's maze.

Part 1 - 35% : Time series chart showing short and long maze runs.

Part 2 - 25% : Pie chart showing "win" fraction.

Part 3 - 20% : Cumulative distribution charts of maze times.

Part 4 - 10% : Table of summary statistics (minimum, maximum, mean, median and mode) and radar charts comparing mazes.

Rubric

100-90. The submission clearly exceeds requirements. All Parts of the project have been completed or nearly completed. The report is clearly organized and well-written, charts and tables are clearly labeled and described, and messages are provided about each Part of the analysis.

89-80. The submission meets requirements. Parts 0-3 of the project have been completed or nearly completed, but perhaps not Part 4. The report is organized and well-written, charts and tables are labeled and described, and messages are provided about most of the analysis.

79-70. The submission barely meets requirements. Parts 0-2 of the project have been completed or nearly completed, and some of Part 3, but not Part 4. The report is semi-organized and semi-well-written, charts and tables are somewhat labeled and described, but parts may be missing. Messages are not always clearly provided for the analysis.

69-60. The project fails to meet requirements in some places. Parts 0-1 of the project have been completed or nearly completed, and some of Part 2, but not Parts 3 or 4. The report is not well-organized nor well-written, charts and tables are not labeled or may be missing. Messages are not always provided for the analysis.

59-0. The project does not meet requirements. Besides Part 0, and maybe Part 1, no other part of the project has been completed. The report is not well-organized nor well-written, charts and tables are not labeled and/or are missing. Messages are not consistently provided for the analysis.

Postmortem Feedback on Graded Projects

The comments below are in response to graded projects. They are not provided in any particular order.



Questions: imgd2905 question-answer forum