# IMGD 2905 - Data Analysis for Game Development

## Homework 3

Due: Thursday, April 28th, 11:59pm

Total points: 11

Homework will be turned in online (Canvas) in written form, saved as a PDF.

1. (1 point) The sampling distribution of the mean can be approximated by the normal distribution (select all that apply).

   1. As the sample size (number of observations in each sample) gets "large enough".
   2. As the size of the population standard deviation increases.
   3. As the size of the sample standard deviation decreases.
   4. For symmetric distributions, if samples of at least 15 observations are selected.
   5. For distributions where the mean equals the median.
2. (1 point) For samples of N=3, the sampling distribution of the mean will be normally distributed (check all that apply):

   1. Regardless of the shape of the population.
   2. If the population is normally distributed.
   3. If the shape of the population is skewed.
   4. If the standard deviation of the mean is 3.
   5. If the standard deviation of the mean is less than 3.
3. (1 point) If a particular set of data is normally distributed, you would find that approximately (check all that apply):

   1. 2 of every 3 observations would fall within 1 standard deviation of the mean.
   2. 19 of every 20 observations would fall within 2 standard deviations of the mean.
   3. The standard error would be smaller for 3 observations than it would be for 20 observations.
   4. The more observations you took, the lower the sample standard deviation would be.
4. (1 point) The size (magnitude) of a confidence interval depends upon (check all that apply):

   1. The number of observations in a sample (N).
   2. The significance (alpha) / confidence selected.
   3. The mean of the population.
   4. The standard deviation of the population.
5. (1 point) Which of the following are true (check all that apply):

   1. You can construct a finite 100% confidence interval for an estimate of the population mean.
   2. Usually, the population mean is the unknown value that is to be estimated.
   3. The significance (alpha) is the proportion in the tails of the distribution that is outside the confidence interval.
   4. The significance (alpha) is the proportion in the tails of the distribution that is inside the confidence interval.
6. (1 point) As a WPI admissions intern, you are tasked with estimating the number of admitted students (class of '26) who will be IMGD majors. You sample 300 students admitted to WPI and find that 45 of them plan to be IMGD majors. The 95% confidence interval for the fraction of incoming students planning to be IMGD majors is 0.15 ± 0.04. Interpret this interval.

   1. You are 95% confident that between 11% and 19% of the sampled students will be IMGD majors.
   2. You are 95% confident that 15% of the incoming students will be IMGD majors.
   3. You are 95% confident that the true percentage of incoming students that will be IMGD majors is between 11% and 19%.
   4. There is a 95% chance of selecting a sample that finds that between 11% and 19% of the incoming students will be IMGD majors.
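
The margin of 0.04 quoted in the last question is consistent with the usual normal-approximation (Wald) interval for a proportion. The assignment does not say which method produced it, so the following is only a minimal sketch under that assumption; the function name `proportion_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) interval for a population proportion.

    Assumption: this is the method behind the interval in the question;
    the assignment itself does not name one.
    """
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin


# Figures from the question: 45 of 300 sampled students.
p_hat, margin = proportion_confidence_interval(45, 300)
print(f"{p_hat:.2f} +/- {margin:.2f}")  # prints "0.15 +/- 0.04"
```

The sketch only reproduces the arithmetic; interpreting the interval is still the point of the question.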

### Problems

1. (1 point) To avoid crowded grocery stores during the pandemic, your family is using home delivery. You pick a random sample of a typical basket of goods and compute the price that each vendor would charge for the same sample. The list is below. Compute a 95% confidence interval estimate of the mean price of a basket of goods for home delivery. Assume the underlying prices for baskets of goods across all vendors follow a normal distribution.

   | Vendor | Price |
   | --- | --- |
   | Instacart | \$72.95 |
   | Amazon Fresh | \$85.13 |
   | Fresh Direct | \$85.85 |
   | Shipt | \$92.13 |
   | Thrive Market | \$72.70 |
   | Hungry Hippo | \$82.19 |
   | Peapod | \$72.57 |

2. (4 points) You built a racing game, Goat Runner, where players ride goats around alumni track.

   1. You sample the time around the track for 30 players and find the mean is about 15.71 seconds with a standard deviation of 4.63 seconds. Find a 90% confidence interval for the population mean time around this track.

   2. You decide the game is too easy and add a few hurdles the goats have to jump over. A sample of another 30 players shows the mean time around the track is now 18.90 seconds with a standard deviation of 7.1 seconds. Find a 90% confidence interval for the population mean time around this track.

   3. Draw a column chart with the data from parts 1 and 2, depicting the 90% confidence intervals.

   4. Interpret the chart, using the confidence intervals to compare the two versions of the track.
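
For reference, the general recipe behind a t-based confidence interval for a mean (sample mean plus or minus a critical value times the standard error) can be sketched as below. This is an illustration under stated assumptions, not a prescribed method: the function name `mean_confidence_interval` and the lap times are hypothetical and are not data from this assignment, and the critical value is passed in by hand from a t table.

```python
from math import sqrt
from statistics import mean, stdev


def mean_confidence_interval(values, t_critical):
    """Return (mean, margin) for a t-based interval from raw observations.

    t_critical is the two-sided critical value for len(values) - 1 degrees
    of freedom, looked up in a t table or computed with statistics software.
    """
    n = len(values)
    standard_error = stdev(values) / sqrt(n)  # sample std dev (n - 1 divisor)
    return mean(values), t_critical * standard_error


# Hypothetical lap times in seconds (made up for this sketch).
lap_times = [12.0, 14.0, 15.0, 16.0, 18.0]
# t critical value for 95% confidence with 4 degrees of freedom.
center, margin = mean_confidence_interval(lap_times, t_critical=2.776)
print(f"{center:.2f} +/- {margin:.2f}")  # prints "15.00 +/- 2.78"
```

When only a sample mean, standard deviation, and sample size are given, the same margin formula applies with those summary values substituted directly.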