# IMGD 2905 - Data Analysis for Game Development

## Homework 1

Due: Tuesday, March 28th, 11:59pm

Homework will be turned in online (Canvas) in written form, saved as a PDF.

Total points: 48

1. (1 point) Statistical inference occurs when you:

1. compute descriptive statistics from a sample
2. take the results of a sample and reach conclusions about a population
3. present a graph of data
4. take a complete census of a population
2. (2 points) A professor wants to study the number of hours per week that the students in her class spend playing games. She selects 20 students from the class of 100 students and gives them a survey.

All of the students in the class constitute the _________.

1. sample
2. population
3. statistic
4. parameter

The 20 students participating in the survey constitute the _______.

1. sample
2. population
3. statistic
4. parameter
3. (1 point) The methods of collecting, presenting and computing characteristics of a set of data in order to describe data features are called:

1. descriptive statistics
2. sampling
3. the scientific method
4. statistical inference
4. (2 points) Suppose an IQP wants to study the average Overwatch rank for WPI undergraduate students compared to their class year (first-year through senior year).

The observed rank for each student is an example of

1. an independent variable
2. a dependent variable
3. a parameter
4. a sample

The observed class year for each student is an example of

1. an independent variable
2. a dependent variable
3. a parameter
4. a sample
5. (1 point) What is the purpose of a measure of central tendency?

6. (1 point) Which of the following statements about the median is not true?

1. It is less affected by extreme values than the mean.
2. It gets larger as the standard deviation gets larger.
3. It is a measure of central tendency.
4. It is equal to the mode in a bell-shaped, "normal" distribution.
7. (1 point) In a symmetric distribution:

1. The mean equals the median.
2. The mean is less than the median.
3. The mean is greater than the median.
4. The median is less than the mode.
8. (1 point) Which of the following measures of dispersion depends on every value in the set of data?

1. Standard deviation
2. Range
3. Both standard deviation and range
4. Neither standard deviation nor range
9. (5 points) Consider the set of numbers: `9 1 1 10 7 11 5 8 2`

1. What is the mean?

2. What is the median?

3. What is the mode?

4. What is the first quartile?

5. What is the range?

10. Consider the histogram below:

1. (3 points) What measure of central tendency would you use to describe it and why?

2. (2 points) What measure of variation would you use to describe it and why?

11. (1 point) Consider the scatter plot below (Figure 2) of the measured carbon dioxide (CO2) in the atmosphere at the Mauna Loa Observatory on the Big Island of Hawaii:

Which statement(s) best describe the relationship between carbon dioxide and time (more than one may apply)?

1. From 1974 to 1985, the CO2 content increased.
2. The CO2 content is cyclic year to year.
3. The CO2 content increases every month compared to the previous month.
4. The CO2 content shows an overall linear trend.
5. The CO2 content shows an overall exponential trend.
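
The summary statistics named in the questions above (mean, median, mode, quartiles, range) can be checked with a few lines of code. The sketch below is only an illustration: the `values` list is hypothetical and deliberately different from the set in the assignment, and `method="inclusive"` is just one of several quartile conventions, so a spreadsheet or textbook may report a slightly different first quartile.

```python
import statistics

# Hypothetical data, NOT the set from the assignment.
values = [4, 8, 15, 16, 23, 42, 8]

mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most frequent value

# quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3).
# "inclusive" interpolates between the sorted data points, treating the
# minimum and maximum as the 0th and 100th percentiles.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Range is simply the largest value minus the smallest.
value_range = max(values) - min(values)

print(f"mean={mean:.2f} median={median} mode={mode} Q1={q1} range={value_range}")
```

Whatever tool you use, state which quartile convention it applies, since the answer can differ between methods on small data sets.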

### Problems

Use a spreadsheet for the following problems.

1. Download the data on Fuel Economy of Select Cars.

1. (3 points) Create a scatter plot of the data. Be sure to label all axes.

2. (3 points) What inferences can you reach about the relationship between fuel economy and engine size?

2. For professional sports, the cost of attending a professional game is often tracked by the Fan Cost Index. The following data represent the cost of four tickets, six drinks, four hot dogs, two programs, two caps, and the parking fee for one car at the arena for each professional team in a league. Here is the data for a professional eSports league.

1. (2 points) Compute the mean and median.

2. (2 points) Compute the quartiles.

3. (3 points) Compute the variance, standard deviation and range.

4. (3 points) Construct a box-and-whisker plot and a histogram. Properly label all axes. Which might you prefer and why?

5. (2 points) Is the data skewed? How can you tell from your graphs in part 4?

6. (3 points) What would you use for a measure of central tendency? Why?

7. (2 points) Which teams have a particularly high cost index? How can you tell?

8. (2 points) Based on the results of parts 1-4, what conclusions can you reach concerning the Fan Cost Index for this league?

9. (2 points) The Cranes have a posh stadium but are still a bargain. What is their Z-score?
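
As a reminder for the last part, a Z-score expresses how many standard deviations a value lies from the mean: z = (x - mean) / standard deviation. The sketch below is a minimal illustration in Python rather than a spreadsheet. The `costs` list and the `z_score` helper are hypothetical and are not the league data. It uses the population standard deviation on the assumption that the data cover every team in the league; if the course expects the sample standard deviation, replace `pstdev` with `stdev`.

```python
import statistics

def z_score(value, data):
    """Return how many standard deviations `value` lies from the mean of `data`."""
    mu = statistics.mean(data)
    # Population standard deviation (assumption: data cover the whole league).
    sigma = statistics.pstdev(data)
    return (value - mu) / sigma

# Hypothetical cost index values, NOT the assignment's data.
costs = [200.0, 220.0, 240.0, 260.0, 280.0]

# Z-score of the cheapest hypothetical team.
z = z_score(200.0, costs)
print(f"z = {z:.3f}")
```

A negative Z-score means the value is below the mean, and a positive one means it is above.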