[Mazetool]

IMGD 2905 Project 2

Mazetool Analytics

Due date: Wednesday, March 29th, 11:59pm


The goal of this project is to use a complete game analytics pipeline on a simple game with a level of your design. This will illustrate a common process in game development - analyzing player behavior/performance in game at various depths. You will design a level (a maze), get players to play your maze (as well as play others' mazes), analyze player data for your maze and compare player performance for your maze with other mazes. Results will be presented in a report.


Top | P0 - Design, Play, Tools | P1 - Maze Running | P2 - Win Tally | P3 - Maze Time | P4 - Maze Analysis | Hints | Writeup | Submission | Grading

Part 0 - Level Design, Play and Analysis Tools

Level Design

For this part, you will design a level (a maze) and play your classmates' levels (mazes).

Read the Mazetool documentation (docx, pdf).

Using Chrome or Firefox (but not Internet Explorer) visit:

http://users.wpi.edu/~bmoriarty/imgd2905/mazetool/cover.html

and play some (3 to 5) randomly generated mazes to get the feel of the game. Design a few (2 to 4) custom mazes and play them to get a better feel for how maze design can impact the experience. For several mazes you complete, take a look at the data that is generated and sent by email to have a better understanding of the information that is collected each game.

When ready, design one maze that provides for a player experience that you think will be interesting. You should have a level-design experience in mind - e.g., a long maze, a tricky maze, both easy and tricky parts in the maze, a maze with backtracking, a direct-line maze, .... Playtest your maze yourself several times to help ensure you get the experience intended. Be sure to have your email (WPI username) entered and click "save". This will allow your maze to be loaded by others.

In preparation for your analysis:

In preparation for your report, take a screenshot of your maze, showing the door, all the gold pieces and the exit (cropping out the Web browser so you have just the maze). You will include this picture in your final document.

Play

When and where indicated by the professor, play the maze of every other student in the class.

Open your Chrome/Firefox Web browser to:

http://users.wpi.edu/~bmoriarty/imgd2905/mazetool/mazeplay.html

Note: This is a different URL than the one you used to create your maze!

From the list of usernames provided by the professor during the class session, enter the username of the first person on the list. (Tip! If an error is reported, you likely entered the wrong username. In that case, retry.) Touch the path to start play, completing the maze as quickly as possible. Do this just once.

When you complete the maze (you reach the exit), refresh your Web browser (hit the refresh button or hit F5). Then, move on to the next username.

Repeat the above for everyone on the list (except yourself, of course).

Analysis Tools

To analyze the mazetool output data, you will use Python and a spreadsheet (e.g., Microsoft Excel). Below is a Python script that parses the Mazetool comma separated value (csv) output. You are encouraged to:

  1. Run the below script to make sure it works. e.g., on Linux
    python3 parse.py sample.csv

    Or, if you are using Thonny, use:

    "Tools" → "Open system shell"
    Then type in "python parse.py sample.csv"

#
# Parse Mazetool output file(s).
# version 1.1
#

# Needed imports.
import csv
import sys

# When running, must specify filename(s) from command line.
# sys.argv is list of command line args.
# len() provides length.
if len(sys.argv) == 1:
    print("Usage: parse.py {filename} [filename] ...")
    sys.exit(1)  # non-zero status signals incorrect usage

# Repeat (loop) for every file from command line.   
for i in range(1, len(sys.argv)):
    filename = str(sys.argv[i])

    # Print out gold events.
    print("Gold (time number):")
    with open(filename, 'r') as csvfile:   # read from file
        reader = csv.DictReader(csvfile)   # treat as csv file
        for row in reader:
            if row['gold'] != '':          # compare values with !=, not identity
                print(row['time'], row['gold'])
    print("-----------")
                
    # Print out click events.
    print("Clicks (time spaces):")
    with open(filename, 'r') as csvfile:   # read from file
        reader = csv.DictReader(csvfile)   # treat as csv file
        for row in reader:
            if row['click'] != '':         # compare values with !=, not identity
                print(row['time'], row['click'])
    print("-----------")
                            
    # Print out exit event (it is always the last line in file).
    print("Exit (time spaces):")
    print(row['time'], row['exit'])

    Running it on the example file sample.csv should produce the following output:

        Gold (time number):
        01.586 1
        01.786 2
        16.404 3
        18.806 4
        23.911 5
        32.215 6
        35.217 7
        37.719 8
        38.119 9
        38.518 10
        -----------
        Clicks (time spaces):
        00.800 0
        12.653 10
        17.255 48
        20.787 64
        25.783 96
        28.885 127
        31.150 131
        33.036 142
        36.191 164
        1:10.071 188
        -----------
        Exit (time spaces):
        1:15.022 238
    
    But any Mazetool output can be used.

  2. Study the script carefully and modify it and re-run it as needed to gain a deep understanding of how it works.

  3. For your analysis, copy, extend and modify it to provide the data you need.

Time in Seconds

Note that the Mazetool output is in seconds only (e.g., 36.191) when under 1 minute, but when over one minute contains the minutes followed by a colon and the seconds (e.g., 1:10.071). The times can be converted to just seconds in all cases with code similar to:

# Convert Mazetool time format to seconds.
t = row['time']
if ":" in t:  # time went over one minute.
    (m, s) = t.split(':')             # split into minutes and seconds
    seconds = int(m) * 60 + float(s)  # convert to total seconds
else:         # time is less than one minute.
    seconds = float(t)                # convert text to a number
print(seconds)


Part 1 - Maze Running

From the data files for people that played your maze, select two example runs that illustrate progression through your maze. One should be a "short" run where the player completed the maze quickly, likely with fewer mouse clicks, and one should be a "long" run where the player took longer to complete the maze, perhaps with more mouse clicks.

Analyze the data, providing a time-series chart of distance traveled (in spaces) versus time (in seconds). The distance should be obtained from the mouse click events. Chart trendlines (with lines and points) should be clearly indicated.

[distance-vs-time]
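As a rough sketch of how the click events might be pulled out for this chart, the code below converts each click row into a (seconds, spaces) pair that can be pasted into a spreadsheet. The column names ('time', 'click') are the ones used by parse.py above; the helper names to_seconds and click_series and the sample rows are made up for illustration.

```python
import csv
import io


def to_seconds(t):
    """Convert a Mazetool time string ("36.191" or "1:10.071") to seconds."""
    if ":" in t:
        minutes, seconds = t.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(t)


def click_series(csvfile):
    """Return a list of (seconds, spaces) pairs, one per click event."""
    series = []
    for row in csv.DictReader(csvfile):
        if row["click"] != "":
            series.append((to_seconds(row["time"]), int(row["click"])))
    return series


# Made-up sample data standing in for one Mazetool output file.
sample = io.StringIO(
    "time,gold,click,exit\n"
    "00.800,,0,\n"
    "01.586,1,,\n"
    "12.653,,10,\n"
    "1:10.071,,188,\n"
    "1:15.022,,,238\n"
)

print("Seconds,Spaces")
for seconds, spaces in click_series(sample):
    print(f"{seconds:.3f},{spaces}")
```

Running this once for the "short" run and once for the "long" run would give the two data sets for the chart.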

Additional Trend Line

While two separate charts could be shown, better would be to have one chart with two trendlines (as done in the sample chart). In Excel, this may be done by first creating a scatter plot for the first data set (e.g., the "long" run):

"Insert" → "Scatter Plot" → "Scatter Plot with Lines and Markers"

Then, add the second data set (e.g., the "short" run) to the first graph by: 1) selecting the data and copying (ctrl-c), 2) selecting the chart, 3) choosing:

"Home" → "Paste" → "Paste special"
making sure "Categories (X Values) in First Column" is selected.


Part 2 - Win Tally

From all the data files for people that played your maze, tally up the number of people that "won" (completed your maze below the time you had specified as a winning time).

Make a pie chart of the number of wins and the number of losses.

[win-tally]

To do this analysis, you will need data in the form:

  Time,
  115.396,
  45.022,
  46.924,
Where each row is the exit time (the time the maze was completed), in seconds.
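One possible way to produce data in this form is sketched below. It assumes, as in parse.py, that the exit event is the row with a non-empty 'exit' column; the function names and the in-memory "files" (with hypothetical filenames) are for illustration only. In practice the files would be opened from the names on the command line.

```python
import csv
import io


def to_seconds(t):
    """Convert a Mazetool time string ("36.191" or "1:10.071") to seconds."""
    if ":" in t:
        minutes, seconds = t.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(t)


def exit_time(csvfile):
    """Return the exit time in seconds, or None if no exit event is found."""
    result = None
    for row in csv.DictReader(csvfile):
        if row["exit"] != "":
            result = to_seconds(row["time"])
    return result


# Made-up data standing in for two Mazetool output files.
runs = {
    "mazetool-a.csv": "time,gold,click,exit\n00.800,,0,\n45.022,,,120\n",
    "mazetool-b.csv": "time,gold,click,exit\n01.200,,0,\n1:55.396,,,301\n",
}

print("Time,")
for name, text in runs.items():
    seconds = exit_time(io.StringIO(text))
    if seconds is not None:
        print(f"{seconds:.3f},")
```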

Once the time csv data is imported into Excel, the wins and losses can be tallied. In a separate column, provide an IF formula that checks if the time is less than the winning time. For example, if the winning time is under 30 seconds, the formula to check the time in cell A1 would be:

  =IF(A1 < 30, "yes", "no")

Copy and paste this formula along the column for each row.

Then, use a COUNTIF formula to count up the "yes" and the "no" values. Something like:

  =COUNTIF(B1:B3, "yes")

For example, a spreadsheet with the analysis of 3 maze times (15.396, 45.022, and 76.924 seconds), might look like:

[Spreadsheet Pie Sample]

A pie chart can be made by selecting the "yes" and "no" rows and the tally column (e.g., C1:C2 to D1:D2 in the example) and then:

"Insert" → "Insert Pie" → "2-D Pie"
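As a cross-check on the spreadsheet, the same tally can be sketched in a few lines of Python. This uses the three example times from above and the example winning time of 30 seconds; the variable names are made up for illustration.

```python
WIN_TIME = 30.0                    # example winning time, in seconds
times = [15.396, 45.022, 76.924]   # the three example maze times

# Same logic as =IF(A1 < 30, "yes", "no") followed by COUNTIF.
wins = sum(1 for t in times if t < WIN_TIME)
losses = len(times) - wins

print("yes", wins)    # prints "yes 1"
print("no", losses)   # prints "no 2"
```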


Part 3 - Maze Time

Analyze the completion times for all the people that played your maze. In addition, compare the distribution of times for your maze to the distribution of times for all other mazes (i.e., do not include your maze data in the "all" maze data).

The time data is the same as in part 2.

  Time,
  115.396,
  45.022,
  46.924,
  ...
But for this part, draw a cumulative distribution chart of the times.

[times-distribution]

To create a cumulative distribution chart:

  1. Sort the data from low to high.
  2. Compute the percent in an adjacent column with =row()/count * 100 where count is the number of data rows.
  3. Select both columns (data and percent).
  4. Create a chart:
    "Insert" → "Scatter Plot" → "Scatter Plot with Lines and Markers"

A trend line for the second distribution (e.g., "all") can be added by the same method used in part 1 to add a trend line to a time series chart.
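The sort-and-percent steps above can also be sketched in Python, which may help in checking the spreadsheet values. The function name cumulative_distribution is made up for illustration, and the rank (1, 2, 3, ...) plays the role of row() in the spreadsheet formula, assuming the data starts in row 1.

```python
def cumulative_distribution(times):
    """Return (time, percent) pairs, with percent = rank / count * 100."""
    ordered = sorted(times)   # step 1: sort from low to high
    count = len(ordered)
    return [(t, rank / count * 100)
            for rank, t in enumerate(ordered, start=1)]


# Example times from the data listing above.
times = [115.396, 45.022, 46.924]

print("Time,Percent")
for t, percent in cumulative_distribution(times):
    print(f"{t:.3f},{percent:.1f}")
```

The last row always reaches 100 percent, which is a quick sanity check on the chart.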

Compare observed times for your maze with the "best" times you determined in Part 0.


Part 4 - Maze Analysis

Do some comparative analytics on individual mazes, including your own, compared to all mazes.

First, for all mazes (including your own), find:

  1. Maximum number of clicks made (for a single player)
  2. Maximum time to complete (for a single player)
  3. Maximum number of spaces traveled (for a single player)

Report the results in a table.
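A minimal sketch of how the per-run values and their maximums might be gathered is below. It assumes the number of clicks is the count of rows with a non-empty 'click' column and that the 'exit' column holds the total spaces traveled (as the "Exit (time spaces)" output of parse.py suggests). The function names and sample runs are made up for illustration.

```python
import csv
import io


def to_seconds(t):
    """Convert a Mazetool time string ("36.191" or "1:10.071") to seconds."""
    if ":" in t:
        minutes, seconds = t.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(t)


def run_stats(csvfile):
    """Return (clicks, seconds, spaces) for one run."""
    clicks, seconds, spaces = 0, None, None
    for row in csv.DictReader(csvfile):
        if row["click"] != "":
            clicks += 1                      # one row per click event
        if row["exit"] != "":
            seconds = to_seconds(row["time"])
            spaces = int(row["exit"])        # assumed: total spaces at exit
    return clicks, seconds, spaces


# Made-up data standing in for two Mazetool output files.
run_a = "time,gold,click,exit\n00.800,,0,\n12.653,,10,\n45.022,,,120\n"
run_b = "time,gold,click,exit\n01.200,,0,\n1:15.022,,,238\n"

stats = [run_stats(io.StringIO(text)) for text in (run_a, run_b)]
print("Max clicks:", max(s[0] for s in stats))
print("Max time:", max(s[1] for s in stats))
print("Max spaces:", max(s[2] for s in stats))
```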

For the variables of clicks, time and spaces, draw radar charts comparing your maze average clicks, average time and average spaces to the average clicks, average time and average spaces for all mazes.

In order to draw a radar chart in Excel, data needs to be in the form:

  Clicks, Spaces, Time,
      7,    128,   72,

Once in Excel, select both rows and all three columns and:

"Insert" → "Other" → "Radar scatter plot"

In doing so, however, the comparative charts will not be very effective! This is because the number of clicks is typically far fewer than either spaces or time. To remedy this, normalize each average by dividing by the maximum for that variable (previously computed for the table), i.e., clicks / max_clicks. This will produce a number from 0 to 1 that is comparable across variables. Data will be similar to:

  Clicks, Spaces, Time,
   0.428,  0.714, 0.98,

A radar plot drawn with the normalized data will be comparable.

[radar-compare]
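The normalization step can be sketched as follows. The averages are the example row shown earlier (7 clicks, 128 spaces, 72 seconds); the maximums are hypothetical values chosen only for illustration, and the function name normalize is made up.

```python
def normalize(averages, maximums):
    """Divide each average by the maximum for the same variable."""
    return {name: averages[name] / maximums[name] for name in averages}


averages = {"clicks": 7, "spaces": 128, "time": 72}    # example row from above
maximums = {"clicks": 20, "spaces": 200, "time": 90}   # hypothetical maximums

normalized = normalize(averages, maximums)
print("Clicks, Spaces, Time,")
print(", ".join(f"{normalized[name]:.3f}"
                for name in ("clicks", "spaces", "time")) + ",")
```

Since every average is at most its maximum, each normalized value falls between 0 and 1.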

Select three other mazes, randomly or based on your own interest (e.g., your friends). Draw similar radar charts for each, being sure to clearly identify which maze each chart came from.

Compare observed average clicks and average spaces for your maze with the "best" values (fewest clicks and shortest path) you determined in Part 0.


Hints

The entire Mazetool data set for all players on all mazes is available at:

http://www.cs.wpi.edu/~imgd2905/d17/projects/proj2/mazetool-data.zip

Note, this dataset is updated dynamically as people play mazes. Thus, for your analysis, the day and time the data set was downloaded should be noted since the content changes over time.

For many kinds of analytics, including game analytics, organization is key. Paying special attention to filenames - raw data, scripts and csv data - will pay dividends as the project progresses and gets more complicated. This is especially important if you ever have to re-visit your analysis, something that is quite common in practice.

With this in mind, some suggestions on keeping organized:

  1. Make small, individual scripts that provide data for one part of the needed analysis. For example, a script may just pull out all the completion (exit) times from a series of files. Nothing more. This could be used for Parts 2 and 3.

  2. Name each script something meaningful that tells something about what it does from the filename. For example, "exit-times.py".

  3. Have any output csv files produced by the script use the same name as the script. For example, "exit-times.csv".

  4. For spreadsheet analysis, have a separate file for each part in the analysis or, alternatively, a separate sheet for each part of the analysis. Name the file (or sheet) with the same name used for the script and the data. For example, "exit-times.xlsx".

  5. Have a brief README.txt file that provides a one-line description of what each script does.

When embedding charts in a report, fonts may often shrink to the point they are not readable! To avoid this, as a guideline, compare the size of text inside a chart to the size of the text in the paper. They should be similar in size. If the chart text size is too small (or way too big), go back to the original chart and choose a font size that results in a final font size that is more proportional to the paper font. Note, this may require adjusting other aspects of the chart, such as axis tick marks and spacing.

The aspiring Python programmer might want to have an easier way to use the code to compute time in seconds. In general, small pieces of code like this can be separated into a "block" of code, called a function. With a function, you could write something like, for example, seconds = getSeconds(row['time']) to get the number of seconds, regardless of whether the format is pre-pended by minutes or not. Tutorials to make functions can be found online; one such document is:

https://www.tutorialspoint.com/python/python_functions.htm
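As an illustration, one way such a function might look is sketched below. It wraps the conversion code from the "Time in Seconds" section; the name get_seconds follows Python naming style and corresponds to getSeconds in the text above.

```python
def get_seconds(t):
    """Convert a Mazetool time string to seconds.

    Handles both "36.191" (seconds only) and "1:10.071" (minutes:seconds).
    """
    if ":" in t:  # time went over one minute
        minutes, seconds = t.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(t)  # time is less than one minute


# Example use with the two formats described earlier.
for text in ("36.191", "1:10.071"):
    print(text, "->", get_seconds(text))
```

Inside the parsing loop, this would be called as seconds = get_seconds(row['time']).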

Many of the grading comments applied to Project 1 are general and pertain to Project 2 as well. You should review the comments made to your Project 1 report and make sure to incorporate needed changes into your Project 2 report.

You should also check out the Postmortem Feedback on Graded Project 1s for general guidelines that also pertain to this Project 2.

For part 3, depending upon how you are getting the exit times, you may want to run your python script on all the mazetool output files in the class. In Linux, you can do this with:

python3 parse.py mazetool*.csv

Note the "*" character. This is a "wildcard" that matches any set of characters (and numbers), so "mazetool*.csv" matches all files that begin with "mazetool" and end with ".csv".

If you are using Thonny on Windows, the "Open system shell" opens a Windows command line terminal. This does NOT support wildcard expansion automatically. So, in the example above, you'd need to type all the names in by hand on the command line to run your python script. You can do this.

As an alternative, some additional python code will do the expansion:

# If on Windows, use the below to support wildcard expansion from the
# command line.
import glob
import sys

# Expand any wildcard patterns among the command line args.
filenames = []
for filename in sys.argv:
    if '*' in filename or '?' in filename or '[' in filename:
        filenames += glob.glob(filename)
    else:
        filenames.append(filename)
sys.argv = filenames

# For example, print the names out.
for i in range(1, len(sys.argv)):
    filename = str(sys.argv[i])
    print(filename)


Writeup

Write up a short report.

For Part 0 (Level Design, Play and Analysis Tools), include details on your maze, describing the high level experience, showing a screen shot of your maze, and providing data on the "win" condition, shortest path, fewest clicks and fastest time estimates.

For each other part of the project, provide a brief section on the analysis in clearly labeled sections (e.g., Part 1 - Maze Running). Include a brief description of the methodology, particularly as it may relate to the results obtained.

All guidelines for presenting and describing charts should be adhered to.


Submission

The assignment is to be submitted electronically via the Instruct Assist Website by 11:59pm on the day due.

The submission is a report in PDF, named proj2_lastname.pdf

To submit your assignment, log into the Instruct Assist website:

https://ia.wpi.edu/imgd2905/

Use your WPI username and password for access. Visit:

Tools → File Submission

Select "Project 2" from the dropdown and then "Browse" and select the assignment file (i.e., proj2_lastname.pdf).

Make sure to hit "Upload File" after selecting it!

If successful, there should be a line similar to:

 Creator    Upload Time             File Name        Size    Status   Removal
 Claypool 2017-03-30 11:20:17  proj2_claypool.pdf  3008 KB  On Time  Delete


Grading

All accomplishments are shown through the report. The point break down does not necessarily reflect effort or time on task. Rather, the scale is graduated to provide for increasingly more effort required for the same reward (points).

Breakdown

Part 0 10% Building a maze and playing everyone else's maze.
Part 1 35% Time series chart showing short and long maze runs.
Part 2 25% Pie chart showing "win" fraction.
Part 3 20% Cumulative distribution charts of maze times.
Part 4 10% Table of maximums and radar charts comparing mazes.

Rubric

100-90. The submission clearly exceeds requirements. All Parts of the project have been completed or nearly completed. The report is clearly organized and well-written, charts and tables are clearly labeled and described and messages provided about each Part of the analysis.

89-80. The submission meets requirements. Parts 0-3 of the project have been completed or nearly completed, but perhaps not Part 4. The report is organized and well-written, charts and tables are labeled and described and messages provided about most of the analysis.

79-70. The submission barely meets requirements. Parts 0-2 of the project have been completed or nearly completed, and some of Part 3, but not Part 4. The report is semi-organized and semi-well-written, charts and tables are somewhat labeled and described, but parts may be missing. Messages are not always clearly provided for the analysis.

69-60. The project fails to meet requirements in some places. Parts 0-1 of the project have been completed or nearly completed, and some of Part 2, but not Parts 3 or 4. The report is not well-organized nor well-written, charts and tables are not labeled or may be missing. Messages are not always provided for the analysis.

59-0. The project does not meet requirements. Besides Part 0, and maybe Part 1, no other part of the project has been completed. The report is not well-organized nor well-written, charts and tables are not labeled and/or are missing. Messages are not consistently provided for the analysis.

Postmortem Feedback on Graded Projects

The comments below are in response to graded projects. They are not provided in any particular order.



Questions: imgd2905 question-answer forum