IMGD 2905 - Data Analysis for Game Development
Homework 1
Due: Wednesday, April 8th, 11:59pm
Homework will be turned in online (Canvas) in written form, saved as a PDF.
Short answer
What is the purpose of a measure of central tendency?
Which of the following statements about the median is not true?
- It is less affected by extreme values than the mean.
- It is a measure of central tendency.
- It is equal to the range.
- It is equal to the mode in a bell-shaped, "normal" distribution.
In a symmetric distribution:
- The median equals the mean
- The mean is less than the median
- The mean is greater than the median
- The median is less than the mode
Which of the following measures of dispersion depend upon every value in the set of data?
- Range
- Standard deviation
- Both range and standard deviation
- Neither range nor standard deviation
Consider the set of numbers: 9 1 1 10 7 11 5 8 2
- What is the mean?
- What is the median?
- What is the mode?
- What is the first quartile?
- What is the range?
- The data is: i) Right skewed, ii) Left skewed, iii) Symmetrical
Consider the histogram below:
- What measure of central tendency would you use to describe it and why?
- What measure of variation would you use to describe it and why?
Consider the scatter plot below (Figure 2):
Which statement best describes the relationship between speed and traffic volume shown in the graph?
- As traffic volume increases, vehicle speed decreases.
- As traffic volume increases, vehicle speed increases.
- As traffic volume increases, vehicle speed increases at first, then decreases.
- As traffic volume increases, vehicle speed decreases at first, then increases.
Problems
Use a spreadsheet for the following problems.
Download the data on New York's Winter Mean temperature.
- Create a scatter plot of the data. Be sure to label all axes.
- What conclusion can you reach about the relationship between time and mean temperature in New York?
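The trend seen in a scatter plot can be cross-checked numerically by fitting a least-squares line and looking at the sign and size of its slope. The sketch below is only an illustration: the years and temperatures are made-up values, not the New York data, and the variable names are arbitrary. A spreadsheet trendline should report the same slope for the same input.

```python
# Hypothetical (year, mean winter temperature) pairs.
# These are NOT the New York data from the assignment.
years = [1990, 1995, 2000, 2005, 2010]
temps = [33.0, 34.5, 34.0, 35.5, 36.0]  # assumed units: degrees F

n = len(years)
mean_x = sum(years) / n
mean_y = sum(temps) / n

# Least-squares slope = sum of cross-deviations / sum of squared x-deviations.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
sxx = sum((x - mean_x) ** 2 for x in years)

slope = sxy / sxx                    # change in temperature per year
intercept = mean_y - slope * mean_x  # fitted value at year 0

print(f"slope = {slope:.3f} degrees per year")
```

A positive slope suggests temperatures rising over time and a negative slope suggests them falling. A slope near zero, or points scattered widely around the line, suggests little or no linear relationship.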
For professional sports, the cost of attending a game is often tracked by the Fan Cost Index. The following data represents the cost of four tickets, six drinks, four hot dogs, two programs, two caps and the parking fee for one car at the arena for each professional team in a league. Here is the data for a professional eSports league.
- Compute the mean and median.
- Compute the quartiles.
- Compute the variance, standard deviation and range.
- Construct a box and whiskers plot and a histogram. Properly label all axes. Which might you prefer and why?
- Is the data skewed? How can you tell from your box and whiskers plot and histogram?
- What would you use for a measure of central tendency? Why?
- Which teams have a particularly high cost index? How can you tell?
- Based on your results from the first four parts above, what conclusions can you reach concerning the Fan Cost Index for this league?
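Spreadsheet results for the quantities above can be sanity-checked with a few lines of Python. The sketch below uses made-up cost values, not the league data, and makes two assumptions that should be matched to whatever the spreadsheet does: quartiles use the "inclusive" interpolation method (which should correspond to the QUARTILE.INC convention in common spreadsheets), and variance and standard deviation are the sample versions (dividing by n - 1). The 1.5 x IQR fence is one common rule of thumb for flagging unusually high or low values, not the only one.

```python
import statistics

# Hypothetical Fan Cost Index values, NOT the league data from the assignment.
costs = [180.0, 195.5, 201.0, 210.25, 215.0,
         222.5, 230.0, 248.75, 260.0, 395.0]

mean = statistics.mean(costs)
median = statistics.median(costs)

# Quartiles by linear interpolation over the sorted data ("inclusive" method).
q1, q2, q3 = statistics.quantiles(costs, n=4, method="inclusive")

# Sample variance and standard deviation (divide by n - 1).
sample_var = statistics.variance(costs)
sample_sd = statistics.stdev(costs)
value_range = max(costs) - min(costs)

# Rule-of-thumb fences: values beyond 1.5 * IQR from the quartiles are flagged.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [c for c in costs if c < lower_fence or c > upper_fence]

print(f"mean={mean:.2f} median={median:.2f}")
print(f"Q1={q1:.2f} Q3={q3:.2f} IQR={iqr:.2f}")
print(f"variance={sample_var:.2f} sd={sample_sd:.2f} range={value_range:.2f}")
print(f"flagged values: {outliers}")
```

In this made-up sample the mean is larger than the median and a single value lies above the upper fence. A long right tail produces the same pattern in a box and whiskers plot or histogram.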