
**A Tale of Two Rules**

**Background:**
How consistent were our contest rules in determining a winner? Statisticians and data scientists use scoring rules, such as mean squared error (MSE), to judge the accuracy of their predictions. The MSE rule scores a prediction by averaging the squared differences between the prediction and the actual values.

**Instructions:**
For each of the testing datasets below, calculate the mean squared error and determine which prediction is best.

**Dataset A:**

Heights from Dataset A: 70.1, 61, 70.1, 68.1, 63, 66.1, 61, 70.1, 72.8, 70.9

| | Team A | Team B | Team C |
|-----------------|--------|--------|--------|
| Minimum | 59.1 | | |
| 1st Quartile | 65 | | |
| Mean | 67.9 | | |
| Median | 68.1 | | |
| 3rd Quartile | 70.9 | | |
| Maximum | 76 | | |
| MSE | 22.05 | | |

**Which team/statistic made the best prediction using MSE?**

**Dataset B:**

Heights from Dataset B: 70.1, 72, 68.9, 61.8, 70.9, 59.8, 72, 65, 66.1, 68.9

| | Team A | Team B | Team C |
|-----------------|--------|--------|--------|
| Minimum | 59.1 | | |
| 1st Quartile | 65 | | |
| Mean | 67.9 | | |
| Median | 68.1 | | |
| 3rd Quartile | 70.9 | | |
| Maximum | | | |
| MSE | | | |

**Which team/statistic made the best prediction using MSE?**

Using the mean squared errors, which team/statistic made the best prediction for both testing datasets?

Answer:

Let's break down how to determine which prediction was the best for each dataset by calculating the Mean Squared Error (MSE).

### Dataset A

1. Given Heights: 70.1, 61, 70.1, 68.1, 63, 66.1, 61, 70.1, 72.8, 70.9.

2. Mean (Team A): 67.9.

3. Calculate MSE for Mean in Dataset A:

- For each height in the dataset, compute the squared difference from the mean.
- Add up all these squared differences.
- Divide the sum by the number of data points (10 in this case) to find the MSE.

4. Result for MSE: The squared differences sum to 170.04, so the MSE for the mean is 170.04 / 10 = 17.004.

5. Determine Best Prediction:

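- The table lists an MSE of 22.05 in the Team A column without saying which statistic it scores. That figure matches the MSE of the 1st Quartile prediction (65) on this dataset.
- Since 17.004 (the Mean's MSE) is less than 22.05, the Mean prediction is the better of the two.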
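
The calculation in steps 3 to 5 can be sketched in Python. This is a minimal illustration rather than part of the worksheet: the helper name `mse` is an assumption introduced here, and the two predictions are the Mean and 1st Quartile values read from the Dataset A table.

```python
# Heights from Dataset A, copied from the worksheet.
heights_a = [70.1, 61, 70.1, 68.1, 63, 66.1, 61, 70.1, 72.8, 70.9]


def mse(actual, prediction):
    """Average squared difference between one predicted value and each actual value.

    `mse` is a hypothetical helper name used only for this illustration.
    """
    squared_diffs = [(value - prediction) ** 2 for value in actual]
    return sum(squared_diffs) / len(actual)


mse_mean = mse(heights_a, 67.9)           # Mean row of the table
mse_first_quartile = mse(heights_a, 65)   # 1st Quartile row of the table

print(round(mse_mean, 3))            # 17.004
print(round(mse_first_quartile, 3))  # 22.05
```

The lower score belongs to the Mean, which agrees with the comparison above.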

### Dataset B

1. Given Heights: 70.1, 72, 68.9, 61.8, 70.9, 59.8, 72, 65, 66.1, 68.9.

2. Mean (Team A): 67.9.

3. Calculate MSE for Mean in Dataset B:

- Similar to Dataset A, compute the squared differences between each height and the mean.
- Sum those squared differences and divide by the number of data points (10).

4. Result for MSE: The squared differences sum to 163.93, so the MSE for the mean is 163.93 / 10 = 16.393.

5. Determine Best Prediction:

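- The Dataset B table lists no MSE values to compare against, so the other statistics have to be scored the same way. Doing so gives 16.573 for the Median (68.1), 22.773 for the 1st Quartile (65), 27.493 for the 3rd Quartile (70.9), and 87.673 for the Minimum (59.1). The Mean's 16.393 is the lowest of these.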
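
The Dataset B arithmetic can be traced one step at a time. The sketch below follows the same three steps (square each difference, sum, divide by the count). The variable names are assumptions chosen for illustration.

```python
# Heights from Dataset B, copied from the worksheet.
heights_b = [70.1, 72, 68.9, 61.8, 70.9, 59.8, 72, 65, 66.1, 68.9]
prediction = 67.9  # Mean row of the table

# Step 1: squared difference between each height and the prediction.
squared_diffs = [(height - prediction) ** 2 for height in heights_b]

# Step 2: add up the squared differences.
total = sum(squared_diffs)

# Step 3: divide by the number of data points.
mse_b = total / len(heights_b)

print([round(d, 2) for d in squared_diffs])
print(round(total, 2))  # 163.93
print(round(mse_b, 3))  # 16.393
```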

### Final Conclusion

For both datasets, the Mean value provided the best prediction with the lowest Mean Squared Error.

- Dataset A: The Mean's prediction was best, with an MSE of 17.004 against the 22.05 listed in the table.
- Dataset B: The Mean's prediction scored an MSE of 16.393. The table gives no other MSE values, but the Median (16.573) and both quartile predictions score higher when computed the same way.

The Mean prediction performed best on both datasets.
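
One way to check this conclusion is to score every statistic from the tables against both datasets and pick the lowest MSE in each. The sketch below assumes the table values are the candidate predictions. It leaves out the Maximum because that cell is blank in the Dataset B table, and the names `mse`, `datasets`, `predictions`, and `best_by_dataset` are introduced here for illustration only.

```python
# Testing datasets, copied from the worksheet.
datasets = {
    "A": [70.1, 61, 70.1, 68.1, 63, 66.1, 61, 70.1, 72.8, 70.9],
    "B": [70.1, 72, 68.9, 61.8, 70.9, 59.8, 72, 65, 66.1, 68.9],
}

# Candidate predictions that have a value in both tables.
# Maximum is omitted: its cell is blank in the Dataset B table.
predictions = {
    "Minimum": 59.1,
    "1st Quartile": 65,
    "Mean": 67.9,
    "Median": 68.1,
    "3rd Quartile": 70.9,
}


def mse(actual, prediction):
    """Average squared difference between a single prediction and the actual values."""
    return sum((value - prediction) ** 2 for value in actual) / len(actual)


best_by_dataset = {}
for name, heights in datasets.items():
    scores = {stat: mse(heights, value) for stat, value in predictions.items()}
    best = min(scores, key=scores.get)  # statistic with the lowest MSE
    best_by_dataset[name] = (best, round(scores[best], 3))
    print(name, best, round(scores[best], 3))
# A Mean 17.004
# B Mean 16.393
```

Under these assumptions the Mean has the lowest MSE on both datasets, with the Median a close second each time.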