Understanding Key Statistical Concepts and Theorems
Law of Large Numbers
If you take samples of larger and larger size from any population, then the mean (x̄) of the sample tends to get closer and closer to μ (the population mean).
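A short simulation can make this concrete. The sketch below is illustrative only: the normal population with μ = 100 and σ = 15, the fixed seed, and the helper name sample_mean are assumptions chosen for the demonstration, not part of these notes.

```python
import random
from statistics import fmean

def sample_mean(n, mu=100.0, sigma=15.0, seed=0):
    """Mean of n random draws from a normal population (mu, sigma are illustrative)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return fmean(rng.gauss(mu, sigma) for _ in range(n))

# As n grows, the sample mean should settle closer to mu = 100.
for n in (10, 100, 10_000):
    print(n, round(sample_mean(n), 2))
```

The exact numbers depend on the seed; the pattern to look for is that the gap between x̄ and μ tends to shrink as n increases.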
Sampling Distribution
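The sampling distribution of the mean is the distribution of the sample means (x̄) over all possible samples of size n drawn from a population. It approaches a normal distribution as n (the sample size) increases.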
Central Limit Theorem
The larger the sample size, the closer the distribution of the sample means is to a normal distribution, even when the population itself is not normally distributed.
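One way to see this is to draw many samples from a clearly non-normal population and look at their means. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (an arbitrary choice for illustration); the helper name sample_means and the seed are also assumptions.

```python
import random
from math import sqrt
from statistics import fmean, stdev

def sample_means(n, reps=2000, seed=1):
    """Means of `reps` samples of size n from a skewed (exponential) population."""
    rng = random.Random(seed)
    return [fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

means = sample_means(n=30)
# The sample means should center near the population mean (1.0), and their
# spread should be near sigma / sqrt(n) = 1 / sqrt(30).
print(round(fmean(means), 3), round(stdev(means), 3), round(1 / sqrt(30), 3))
```

Plotting a histogram of means for n = 2, 10, and 30 would show the skew fading as n grows; the printed spread is the standard error discussed in the next section.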
Standard Error
The standard error is the standard deviation of the distribution of the sample means.
z-score for a single score: z = (x – μ) / σ
Standard error: σx̄ = σ / √n
z-score for a sample mean: z = (x̄ – μ) / (σ / √n)
As a rule of thumb, the sample size should be 30 or greater for the normal approximation to apply.
T-distributions have more variability than the normal distribution and a flatter, more spread-out shape.
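The two formulas can be sketched as small helper functions. The function names are hypothetical, and σ = 15 is borrowed from the IQ examples later in these notes purely for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sample means: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def z_for_sample_mean(xbar, mu, sigma, n):
    """z-score of a sample mean, using the standard error as the denominator."""
    return (xbar - mu) / standard_error(sigma, n)

# The standard error shrinks as the sample size grows.
for n in (1, 25, 100):
    print(n, standard_error(15, n))  # 15.0, then 3.0, then 1.5
```

With n = 1 the standard error equals σ, so the z-score for a sample mean reduces to the ordinary z-score for a single score.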
Example: Patient Recovery Time
The patient recovery time from a particular surgical procedure is normally distributed with a mean of 5.3 days and a standard deviation of 2.1 days. The 90th percentile corresponds to z ≈ 1.28, so it is approximately 5.3 + 1.28(2.1) ≈ 7.99 days.
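The percentile can be checked with Python's standard library instead of a z-table. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

# Recovery time in days, as given in the example.
recovery = NormalDist(mu=5.3, sigma=2.1)

# inv_cdf returns the value below which the given proportion of the distribution falls.
p90 = recovery.inv_cdf(0.90)
print(round(p90, 2))  # 7.99
```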
Example: IQ Scores
IQ is normally distributed with a mean of 100 and a standard deviation of 15. Let X = IQ of an individual. X ~ N(100, 15)
Z = (120 – 100) / 15 = 1.33
The probability that a person has an IQ greater than 120 is approximately 0.0918.
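The same tail probability can be computed directly. The sketch below shows both the unrounded result and the result for z rounded to 1.33 as in a printed table; the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# P(X > 120) is the area to the right of 120.
p_exact = 1 - iq.cdf(120)

# A z-table forces z to be rounded to 1.33, which shifts the answer slightly.
p_table = 1 - NormalDist().cdf(1.33)

print(round(p_exact, 4), round(p_table, 4))
```

The table-based value matches the 0.0918 above; the unrounded value comes out slightly lower (about 0.0912) because z = 20/15 is not rounded first.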
Example: Middle 50% of IQs
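The middle 50% of IQs fall between what two values? The middle 50% lies between the 25th and 75th percentiles, which correspond to z = –0.67 and z = +0.67.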
- X = 100 + 0.67(15) = 110.05
- X = 100 – 0.67(15) = 89.95
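The two cutoffs are the 25th and 75th percentiles, which can be looked up without a table. A minimal sketch, with illustrative variable names:

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# The middle 50% runs from the 25th to the 75th percentile.
lower = iq.inv_cdf(0.25)
upper = iq.inv_cdf(0.75)
print(round(lower, 2), round(upper, 2))
```

The results differ slightly from 89.95 and 110.05 because the unrounded z is about 0.6745 rather than 0.67.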
Example: Sample Mean Probability
For a sample of n = 25 scores, what is the probability that the sample mean (M) will be within 5 points of the population mean? In other words, what is p(95 < M < 105)?
- Z = (M – μ) / √(σ² / n) = (M – μ) / (σ / √n)
- Z for 95: (95 – 100) / (15 / √25) = -5 / 3 = -1.67
- Z for 105: (105 – 100) / (15 / √25) = 5 / 3 = 1.67
- z = 1.67 corresponds to an area of 0.45254 between the mean and z; 2 * 0.45254 = 0.9051
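The calculation above can be sketched by modeling the sample mean itself as a normal variable whose standard deviation is the standard error. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 15, 25

# Sampling distribution of the mean: same center, spread = sigma / sqrt(n) = 3.
mean_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# Probability that the sample mean lands between 95 and 105.
p_within = mean_dist.cdf(105) - mean_dist.cdf(95)
print(round(p_within, 4))
```

The result is close to the table-based answer (about 0.90); small differences come from rounding z to 1.67 before using the table.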
Example: NBA Player Heights
The heights of the 430 National Basketball Association players were listed on team rosters at the start of the 2005–2006 season. The heights of basketball players have an approximate normal distribution with mean (μ) = 79 inches and a standard deviation (σ) = 3.89 inches.
For a height of 77 inches: z = (77 – 79) / 3.89 = -0.51
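The z-score step is the same in every example, so it can be written once as a helper. The function name z_score is hypothetical.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

print(round(z_score(77, 79, 3.89), 2))   # -0.51 (NBA height example)
print(round(z_score(120, 100, 15), 2))   # 1.33 (IQ example)
```

A negative z-score, as for the 77-inch player, simply means the value is below the mean.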
Transforming Scores
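Formula: X = μ + zσ, where z is the z-score on the original scale and μ and σ are the mean and standard deviation of the new scale.
Example: original scale with μ = 450, σ = 60
- Scorch Jones: 520
- Singe Johnson: 392
New standardized scale: μ = 100, σ = 10
Scorch: z = (520 – 450) / 60 = 1.167
100 + 1.167(10) = 111.67
Singe: z = (392 – 450) / 60 = -0.967
100 – 0.967(10) = 90.33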
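The two-step transformation (convert to z, then to the new scale) can be sketched as one function. The name rescale and its parameter names are hypothetical.

```python
def rescale(x, mu_old, sigma_old, mu_new, sigma_new):
    """Convert a score to a z-score, then place it on a scale with a new mean and SD."""
    z = (x - mu_old) / sigma_old
    return mu_new + z * sigma_new

for name, score in (("Scorch Jones", 520), ("Singe Johnson", 392)):
    print(name, round(rescale(score, 450, 60, 100, 10), 2))
# Scorch Jones 111.67
# Singe Johnson 90.33
```

Each person keeps the same relative position: a score above the old mean stays above the new mean by the same number of standard deviations.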
Example: Hypothesis Testing
Sample: n = 36, M = 100, SD = 5; Population: N = 1,000, μ = 99, σ = 2.5
- State your null and alternative hypotheses:
- H0: μ = 99
- H1: μ ≠ 99
- Set your decision criterion (alpha level, one- or two-tailed): Two-tailed, α = 0.05
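- Compute the statistic (Z-test, because the population standard deviation of 2.5 is known):
Z = (100 – 99) / (2.5 / √36) = 2.4
We would reject the null hypothesis because the value of 2.4 exceeds the critical value of 1.96 (two-tailed, α = 0.05) and therefore falls in the critical region.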
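The steps above can be sketched as a single function. The name z_test and its return layout are assumptions made for illustration; the critical value and p-value come from the standard normal distribution in Python's standard library.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z-test with a known population standard deviation."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()                      # mean 0, standard deviation 1
    z_crit = std_normal.inv_cdf(1 - alpha / 2)     # about 1.96 for alpha = 0.05
    p_value = 2 * (1 - std_normal.cdf(abs(z)))     # area in both tails beyond |z|
    return z, z_crit, p_value, abs(z) > z_crit

z, z_crit, p_value, reject = z_test(sample_mean=100, mu0=99, sigma=2.5, n=36)
print(round(z, 2), round(z_crit, 2), round(p_value, 4), reject)
```

The p-value (roughly 0.016) leads to the same decision as the critical-value comparison: it is below α = 0.05, so the null hypothesis is rejected.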
Regions
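- Retention region: The result is consistent with the null hypothesis; fail to reject the null hypothesis (no evidence of an effect).
- Rejection region: A result this extreme is unlikely if the null hypothesis is true; reject the null hypothesis (evidence of an effect).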
Error Types
- Type I: Reject the null hypothesis when it is true.
- Type II: Fail to reject the null hypothesis when it is false.
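A simulation can illustrate what a Type I error rate means in practice: when the null hypothesis really is true, a two-tailed test at α = 0.05 should still reject in roughly 5% of samples. The sketch below reuses μ = 99, σ = 2.5, and n = 36 from the hypothesis-testing example; the function name, seed, and number of repetitions are assumptions.

```python
import random
from math import sqrt
from statistics import fmean

def type_i_error_rate(mu=99.0, sigma=2.5, n=36, reps=2000, z_crit=1.96, seed=42):
    """Fraction of samples drawn under a TRUE null hypothesis that get rejected."""
    rng = random.Random(seed)
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(reps):
        m = fmean(rng.gauss(mu, sigma) for _ in range(n))  # sample mean under H0
        if abs((m - mu) / se) > z_crit:                    # falls in the rejection region
            rejections += 1
    return rejections / reps

rate = type_i_error_rate()
print(rate)  # expected to land near alpha = 0.05; the exact value depends on the seed
```

Estimating the Type II error rate would work the same way, except the samples would be drawn from a population whose mean differs from 99 and the function would count failures to reject.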
Degrees of Freedom
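The more degrees of freedom we have, the closer a sample estimate (for example, s as an estimate of σ) tends to be to the actual population value, and the closer the t-distribution is to the normal curve.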
Hypothesis Testing Differences
The difference between the t-test and the z-test is that the critical region changes: the t critical value depends on the degrees of freedom, whereas the normal curve always uses ±1.96 (two-tailed, α = 0.05).
Standard Deviation vs. Standard Error
The standard deviation is the typical (roughly average) distance of the individual data points from their mean; σ describes a population and s describes a sample. The standard error is the typical distance of the sample means (of size n) from the population mean.
Example: t-test
- H0: μ(physical education test) = 12
- H1: μ(physical education test) ≠ 12
- Two-tailed, α = 0.05
- df = 25 – 1 = 24
- Critical region: ±2.064
- t = (15 – 12) / (29 / √25) = 0.517
- t(24) = 0.517, p > 0.05
We fail to reject the null hypothesis because t = 0.517 falls inside the critical values of ±2.064, so there is no significant effect of the P.E. programs on pushup scores. On average, students who are in P.E. programs do not score significantly differently from 12. M = 15, SD = 1.67
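The t statistic and the decision can be reproduced with a short sketch. It uses the values exactly as they appear in the calculation line (mean 15, comparison value 12, s = 29, n = 25) and takes the critical value 2.064 from the notes rather than computing it, since Python's standard library has no t-distribution. The function name one_sample_t is hypothetical.

```python
from math import sqrt

def one_sample_t(sample_mean, mu0, s, n):
    """t statistic for a one-sample t-test: (M - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / sqrt(n))

n = 25
t = one_sample_t(sample_mean=15, mu0=12, s=29, n=n)
df = n - 1
t_crit = 2.064  # two-tailed critical value for df = 24, alpha = 0.05 (from the notes)
reject = abs(t) > t_crit

print(round(t, 3), df, reject)  # 0.517 24 False
```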
t-test Formula
t = (x̄ – μ) / sx̄, where sx̄ = s / √n is the estimated standard error of the mean
We use a t-test when we do not know the population standard deviation.
s² = SS / (n – 1), s = √(SS / (n – 1)), where SS is the sum of squared deviations of the scores from the sample mean
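The chain from SS to s to the estimated standard error can be sketched on a small made-up data set. The five scores below are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
from math import sqrt
from statistics import fmean, variance

scores = [10, 12, 15, 17, 21]  # hypothetical data, not from the notes
n = len(scores)
m = fmean(scores)                          # sample mean = 15.0

ss = sum((x - m) ** 2 for x in scores)     # sum of squared deviations = 74.0
s2 = ss / (n - 1)                          # sample variance = 18.5
s = sqrt(s2)                               # sample standard deviation
se = s / sqrt(n)                           # estimated standard error of the mean

print(ss, s2, round(s, 3), round(se, 3))
# statistics.variance uses the same n - 1 denominator.
print(variance(scores))
```

Dividing by n – 1 rather than n is what ties this formula to the degrees of freedom used in the t-test (df = n – 1).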