IMGD 2905 - Project 1

The goal of this project is to set up tools for a game analytics pipeline and apply the pipeline to Riot's League of Legends (League). You will work through steps to set up tools that allow for querying, extraction and analysis of data. The pipeline will be exercised end to end: from basic queries to Riot's game data set, through analysis and presentation in charts and tables, to a report for dissemination. The tool pipeline will be used for subsequent projects, including a more advanced analysis of League data.

Part 0 - Setup

Part 1 - Matches Played

For the League player (summoner) named "Faker" (Wiki), analyze the cumulative number of (ranked) LoL matches he has played since he started (using the start day as day 0).

To gather the needed data, use Python and the Riot API. Below is a Python script showing some basic queries. Lines beginning with # are comments, written to help explain what the code is doing. You are encouraged to: 1) run the script below to make sure it works, 2) study it carefully, modifying and re-running it as needed to gain a deep understanding of how it works, and 3) copy, extend and modify it to provide the data you need for this project.

#!/usr/bin/python3
#
# basic.py - Do some basic queries using the Riot API.
#
# version 1.2
#

# Bring in Python imports needed for data processing.
# RiotWatcher from: https://github.com/pseudonym117/Riot-Watcher
from riotwatcher import RiotWatcher
import json
import time

# Replace below with my Riot developer key.
developer_key = 'my-developer-key-here'

# Get master RiotWatcher object that queries Riot API using my key.
r_w = RiotWatcher(developer_key)

# Get player with summoner name 'faker'.
# Wiki: https://en.wikipedia.org/wiki/Faker_(video_gamer)
player = r_w.get_summoner(name='faker')

# Print out player info.
print("Player")
print(json.dumps(player, indent=3))

# Get player's match list from match data.
match_data = r_w.get_match_list(player['id'])
match_list = match_data['matches']

# Loop through all matches, printing out champion id.
# Note: the champion id corresponds to an id in the list returned by static_get_champion_list().
print("\nChampions played") # "\n" puts out a newline (blank line).
count = 0
for match in match_list:
    count = count + 1  # tally the number of matches
    champion = match["champion"]
    print(champion, end=", ")  # end=", " puts a comma and space after
print("\nTotal matches: %d" % count)    # %d is for integer

# Print out time of oldest (and last) match in list.
# Note: match times are in milliseconds since 1970.
i = len(match_list)-1 # in Python, the index of the last item in a list is length-1.
match = match_list[i]
match_time_old = match['timestamp']
print("\n\nOldest match time: ", end="")
print(json.dumps(match_time_old, indent=3))
print(time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(match_time_old/1000)))

# Print out time of newest (and first) match in list.
i = 0  # in Python, the first item in a list is '0'.
match = match_list[i]
match_time_new = match['timestamp']
print("Newest match time: ", end="")  # end="" means don't add a newline
print(json.dumps(match_time_new, indent=3))
print(time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(match_time_new/1000)))

# Compute time elapsed between newest and oldest.
milliseconds = match_time_new - match_time_old
minutes = milliseconds / (1000 * 60)
hours = minutes / 60
days = hours / 24
years = days / 365

print("\nTime difference between newest and oldest")
print("hours: %d" % hours)    # %d is for integer
print("days: %d" % days)      # %d is for integer
print("years: %.2f" % years)  # %.2f is for real, 2 digits after decimal

# Write some data to file with commas (i.e., a csv).
with open('basic.csv', 'w') as csvfile:
    print("Hours, Days, Years", file=csvfile)
    # Use %.2f for years so the fraction is not truncated to an integer.
    print("%d, %d, %.2f" % (hours, days, years), file=csvfile)
# Note, file closes automatically here.

To draw the chart of "Games Played" versus "Time", use a spreadsheet (e.g., Microsoft Excel). Data needs to be in a format similar to:

Note! The values extracted (and that you print out) may be in reverse chronological order (i.e., newest to oldest). This can be fine for generating a chart - you do not need to reverse them.

The values are aligned in vertical columns, with each column separated by a comma (,). This file format is known as "csv" for "comma separated values" and can be read into most spreadsheets, neatly placing the columns and rows into spreadsheet cells.
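As a minimal sketch of producing such a file, the snippet below turns a list of match timestamps (milliseconds since 1970, as in match['timestamp'] above) into rows of days since the first match and cumulative matches played. The function name `cumulative_rows`, the column headers, and the sample timestamps are assumptions made for illustration; they are not part of the assignment or the Riot API.

```python
import csv
import io

MS_PER_DAY = 1000 * 60 * 60 * 24

def cumulative_rows(timestamps_ms):
    """Return (day, cumulative_matches) rows, oldest match first."""
    ordered = sorted(timestamps_ms)          # oldest to newest
    if not ordered:
        return []
    start = ordered[0]                       # day 0 is the first match
    rows = []
    for count, stamp in enumerate(ordered, start=1):
        day = (stamp - start) // MS_PER_DAY  # whole days since the start
        rows.append((day, count))
    return rows

# Hypothetical timestamps, newest first (the order the API may return).
sample = [3 * MS_PER_DAY + 500, MS_PER_DAY + 10, 10, 0]
rows = cumulative_rows(sample)

# Write to an in-memory buffer; for a real file, one could instead use
# open('matches.csv', 'w', newline='').
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Day", "Matches"])
writer.writerows(rows)
print(buffer.getvalue())
```

This sketch sorts the timestamps so the running count increases with time; charting the unsorted values works too, as noted above.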

Part 2 - Champions Played

Note, there are many ways to compute this! However, a recommended way is to list all the champion ids and compute the mode() in Excel. Tips for how to compute the mode are at:

For the final answer, the Champion must be reported by name (e.g., "Leona") and not by id (e.g., 89). To find the champion name associated with a champion id, refer to the "hello.py" script from the setup. The JSON object returned by static_get_champion_list() has an id field corresponding to the ids found from match['champion']. Think about how to print out the JSON returned by static_get_champion_list().
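For those who prefer to compute the mode in Python rather than Excel, one possible sketch follows. It assumes the champion ids have already been collected from match['champion'] into a list, and that an id-to-name table has been built by hand from the champion list output. The function `most_played`, the sample ids, and the `names` dictionary are hypothetical; only the pairing of id 89 with "Leona" comes from the example above.

```python
from collections import Counter

def most_played(champion_ids, names_by_id):
    """Return (name, count) for the most frequently played champion."""
    champion_id, count = Counter(champion_ids).most_common(1)[0]
    # Fall back to the raw id if the lookup table has no entry for it.
    return names_by_id.get(champion_id, str(champion_id)), count

# Hypothetical data: ids as collected from match["champion"], and a
# hand-built id-to-name table.
ids = [89, 7, 89, 103, 89, 7]
names = {89: "Leona"}

name, count = most_played(ids, names)
print(name, count)
```

Counter.most_common(1) returns the single most frequent value, which is the mode; if two champions tie, it returns the one encountered first.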

Repeat the above analysis, but do so for two separate groups of matches: the oldest half of Faker's matches and the newest half of Faker's matches. Determine whether the mode changed (i.e., whether Faker changed his main champion halfway through his career).
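The split into halves can be sketched roughly as follows, assuming the champion ids are in a list ordered newest first (as the match list may be). The function name `mode_by_half` and the sample ids are illustrative assumptions. With an odd number of matches, this sketch places the middle match in the newest half.

```python
from collections import Counter

def mode_by_half(champion_ids_newest_first):
    """Return (oldest_half_mode, newest_half_mode) as champion ids."""
    ids = list(reversed(champion_ids_newest_first))  # oldest first
    middle = len(ids) // 2
    oldest, newest = ids[:middle], ids[middle:]

    def mode(half):
        # Most frequent id in this half of the matches.
        return Counter(half).most_common(1)[0][0]

    return mode(oldest), mode(newest)

# Hypothetical champion ids, newest match first.
sample_ids = [7, 7, 89, 7, 89, 89, 89, 7]
old_mode, new_mode = mode_by_half(sample_ids)
print("oldest half:", old_mode, "newest half:", new_mode)
print("changed:", old_mode != new_mode)
```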

Part 3 - Compare Players

Pick another League player of choice (note, s/he must have played in competitive/ranked LoL matches in order to gather data) and do the same analysis of matches and champions played as that done for Faker.

There are many ways to find competitive/professional League players, but a pretty easy way is through a Google search. Or, any friends that have played competitive/ranked League matches can be used.

Draw a chart comparing your selected player to Faker (note, this means a chart that shows both data sets on one chart).

Provide a combined table with your selected player's champions played and with Faker's champions played.
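One way to prepare data for a chart with both players on it is to merge the two series into shared rows, one row per day, with a cumulative count column for each player. The sketch below assumes each player's matches have already been converted to day offsets from that player's own first match. The names `cumulative_by_day` and `combined_rows` and the sample day lists are hypothetical.

```python
def cumulative_by_day(days):
    """Map each day offset to the cumulative match count on that day."""
    totals = {}
    for count, day in enumerate(sorted(days), start=1):
        totals[day] = count  # later matches on the same day overwrite
    return totals

def combined_rows(days_a, days_b):
    """Return (day, cumulative_a, cumulative_b) rows for a shared chart."""
    a, b = cumulative_by_day(days_a), cumulative_by_day(days_b)
    rows, last_a, last_b = [], 0, 0
    for day in sorted(set(a) | set(b)):
        last_a = a.get(day, last_a)  # carry forward if no match that day
        last_b = b.get(day, last_b)
        rows.append((day, last_a, last_b))
    return rows

# Hypothetical day offsets for two players.
player_a_days = [0, 0, 2, 5]
player_b_days = [0, 1, 1, 5, 6]
merged = combined_rows(player_a_days, player_b_days)
for row in merged:
    print("%d, %d, %d" % row)
```

Each printed line has three comma-separated columns, so a spreadsheet can plot both cumulative columns against the same day axis.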

Part 4 - Match Data

For the first analysis, produce a histogram of the number of matches versus the match length, broken into 5 minute intervals (i.e., the "bucket size" is 5 minutes), something like:

The match data can be obtained from the RiotWatcher call to get_match(), called with a match id (a number). In the case of this project, the matchId is one of the matches for Faker. For example, a code snippet may look like:

For the second analysis, compute summary statistics on the match duration - the minimum, maximum, average and standard deviation of the match length. Present this result in a table.
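Both analyses can be sketched in Python once the match durations have been extracted. The snippet below makes no Riot API calls: it assumes the durations are already in a list and are measured in seconds, which should be checked against the actual match data. The function names `histogram` and `summary` and the sample durations are illustrative assumptions.

```python
import statistics
from collections import Counter

BUCKET_MINUTES = 5

def histogram(durations_seconds):
    """Count matches per 5-minute bucket, keyed by bucket start minute."""
    counts = Counter()
    for seconds in durations_seconds:
        minutes = seconds / 60
        bucket_start = int(minutes // BUCKET_MINUTES) * BUCKET_MINUTES
        counts[bucket_start] += 1
    return dict(sorted(counts.items()))

def summary(durations_seconds):
    """Return min, max, mean and standard deviation in minutes."""
    minutes = [s / 60 for s in durations_seconds]
    return {
        "min": min(minutes),
        "max": max(minutes),
        "mean": statistics.mean(minutes),
        "stdev": statistics.stdev(minutes),  # sample standard deviation
    }

# Hypothetical match durations, assumed to be in seconds.
durations = [1500, 1740, 1800, 2100, 2460]
buckets = histogram(durations)
stats = summary(durations)

for start, matches in buckets.items():
    print("%d-%d min: %d" % (start, start + BUCKET_MINUTES, matches))
print("mean: %.2f  stdev: %.2f" % (stats["mean"], stats["stdev"]))
```

statistics.stdev computes the sample standard deviation; statistics.pstdev gives the population version if that is preferred.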

Remember, the independent variable goes on the x-axis (horizontal axis) and the dependent variable goes on the y-axis (vertical axis). The independent variable is the one that you manipulate, and the dependent variable is the one that you observe. Note that sometimes you do not really manipulate either variable; you observe them both. In that case, if you are testing the hypothesis that changes in one variable cause (or at least correlate with) changes in the other, put the variable that you think causes the changes on the x-axis.

Select "Project 1" from the dropdown and then "Browse" and select the assignment file (i.e., proj1-lastname.zip).

Part 1	60%	The analysis of matches played represents more than half of the grade. Completing this part means your tool pipeline is set up and can be used, with a basic demonstration of one full set of analysis.
Part 2	20%	The analysis of the champions is worth an additional 20%. Completing this demonstrates associating data from one script/table with another (that of `hello.py`), a common skill needed for data analytics.
Part 3	13%	Comparing players is worth an additional letter grade's worth of points. Doing so reinforces the skills already demonstrated once.
Part 4	7%	Analyzing match data is worth a small fraction of the grade as it represents the "icing on the cake". Analyzing the match data shows an additional set of queries as well as analysis of a new data structure.

Postmortem Feedback on Graded Projects

The comments below are in response to graded projects. They are not provided in any particular order.
