QMS 202 Test 2: Hypothesis Testing and Statistical Analysis

QMS 202 Test 2: Statistical Analysis Crib Sheet

By: Eron Gjongecaj, Student nr: 50053246

Chapter 12: Comparing Means and Proportions

Topics: Comparing means of two related populations, comparing the proportions of two independent populations.

Comparing Means Between Two Related Populations: Paired T-Test

Paired t-test for the mean difference: the population of differences needs to be normally distributed. If it is not, we can still use the paired t-test provided the sample size is not very small.

  1. State the null and alternative hypotheses. Define the null and alternative hypotheses we want to test.
  2. Specify the desired level of significance and the sample size. Example: α = 0.01 & n = 5. Therefore, df = 5 − 1 = 4.
  3. Determine the appropriate technique: T-Test
  4. Determine the critical value (check whether the test is one- or two-tailed). Select STAT, F5 (DISTR), F2(t) and then F3(Invt). This opens the Inverse Student-t menu.
  5. Enter the score differences into List 1.
  6. Select STAT, F3 (TEST), F2(t) and then F1(1-S). This opens the 1-Sample tTest menu; enter the direction of H1, the hypothesized mean difference, and List 1.
  7. Statistical Decision: Since p-value > α, we do not reject the null hypothesis H0 at α = 0.01.
  8. Conclusion: The evidence does (or does not, depending on the result) indicate that there is a significant change in the number of complaints.

If µ1 denotes the population mean exam score before the course, µ2 the population mean exam score after the course, and µD = µ1 − µ2, then the alternative hypothesis for the test is H1 : µD ≠ 0.
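
Off the calculator, the same paired t-test can be checked with a short Python sketch (the before/after scores below are made-up placeholders, not values from any exam question):

  # Paired t-test for the mean difference (placeholder data, n = 5, so df = 4).
  from scipy import stats

  before = [94, 87, 78, 85, 90]   # placeholder scores before the course
  after  = [96, 92, 84, 88, 91]   # placeholder scores after the course

  # H0: muD = 0 vs H1: muD != 0, where muD = mu1 - mu2.
  t_stat, p_value = stats.ttest_rel(before, after)
  print(t_stat, p_value)          # reject H0 at alpha = 0.01 only if p_value < 0.01

  # Two-tailed critical value for step 4: +/- t_{alpha/2} with df = 4.
  print(stats.t.ppf(1 - 0.01 / 2, df=4))   # about 4.604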

Comparing the Proportions of Two Independent Populations

There are two methods for testing the difference between two proportions selected from independent populations:

  • Z-test for the difference between two proportions.
  • χ2 test for the difference between two proportions.

We have two populations: Population 1 and Population 2. Data collected in both samples are categorical: each sampled item either is or is not an item of interest. We take a random sample of size n1 from Population 1 and a random sample of size n2 from Population 2. The number of items of interest in Sample 1 is denoted by X1, and the number of items of interest in Sample 2 is denoted by X2. The sample proportion of items of interest is p1 = X1/n1 for Sample 1 and p2 = X2/n2 for Sample 2. The proportion of items of interest in Population 1 is π1, and the proportion of items of interest in Population 2 is π2.

Example question: Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A? In a random sample, 36 of 72 men and 35 of 50 women indicated they would vote Yes. Test at the 0.05 level of significance.

Z-Test for the Difference Between Two Proportions
  1. State the null and alternative hypotheses. The null and alternative hypotheses we want to test are H0 : π1 − π2 = 0 and H1 : π1 − π2 ≠ 0. H0 : π1 − π2 = 0 means that there is no difference between the proportion of men and the proportion of women who will vote Yes on Proposition A. H1 : π1 − π2 ≠ 0 means that there is a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A.
  2. Specify the desired level of significance and compute the proportions. We have α = 0.05, p1 = X1/n1 = 36/72 = 0.5 for men, and p2 = X2/n2 = 35/50 = 0.7 for women.
  3. Determine the appropriate technique. This is a Z-Test for the Difference Between Two Proportions.
  4. Determine the critical value. The critical values ±Z α/2 for α = 0.05 can be found using the calculator: Select STAT, F5 (DISTR), F1(NORM) and then F3(InvN). Now, in the Inverse Normal menu, set Data: F2 (Variable), Tail: F3 (CNTR), Area: 0.95 EXE, σ: 1, µ: 0.
  5. Select STAT, F3 (TEST), F1(Z) and then F4(2-P). Now, in the 2-Prop ZTest menu, select the following options: p1: F1 (≠ p2), x1: 36, n1: 72, x2: 35, n2: 50.
  6. Statistical Decision: Since p-value = 0.027 < α = 0.05, we reject the null hypothesis H0.
  7. Conclusion: There is evidence of a difference between the proportions of men and women who will vote Yes.
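
The calculator's 2-Prop ZTest result can also be reproduced by hand from the pooled proportion; a minimal Python sketch using the Proposition A numbers above:

  # Two-proportion Z-test with pooled proportion (36 of 72 men, 35 of 50 women).
  from math import sqrt
  from scipy.stats import norm

  x1, n1 = 36, 72      # men voting Yes
  x2, n2 = 35, 50      # women voting Yes

  p1, p2 = x1 / n1, x2 / n2
  p_pool = (x1 + x2) / (n1 + n2)                        # pooled proportion under H0
  se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
  z_stat = (p1 - p2) / se                               # about -2.20
  p_value = 2 * norm.sf(abs(z_stat))                    # two-tailed, about 0.027
  print(z_stat, p_value)

  print(norm.ppf(0.975))   # critical values for step 4 are about +/- 1.96
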
Confidence Interval for the Difference Between Two Proportions. Example: find a 99% confidence interval for the difference between the proportions of males and females who enjoy shopping for clothing, where 136 of 240 males and 224 of 260 females enjoy it.

  1. Select STAT, F4 (INTR), F1(Z) and then F4(2-P). Now, in the 2-Prop ZInterval menu, enter the following options:
  2. C-Level: 0.99 EXE
  3. x1: 136 EXE
  4. n1: 240 EXE
  5. x2: 224 EXE
  6. n2: 260 EXE

2-Prop ZInterval: Left = −0.3940313, Right = −0.1957122, p̂1 = 0.56666666, p̂2 = 0.86153846, n1 = 240, n2 = 260. Conclusion: We are 99% confident that the difference in the proportions of males and females who enjoy shopping for clothing is between −0.3940313 and −0.1957122.
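
The interval above can be verified directly from the formula p̂1 − p̂2 ± Zα/2 · sqrt(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2); a short Python check with the shopping data:

  # 99% confidence interval for pi1 - pi2 (136 of 240 males, 224 of 260 females).
  from math import sqrt
  from scipy.stats import norm

  x1, n1 = 136, 240
  x2, n2 = 224, 260

  p1, p2 = x1 / n1, x2 / n2
  z = norm.ppf(0.995)                                   # 99% C-Level, about 2.576
  se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)    # unpooled standard error
  print((p1 - p2) - z * se, (p1 - p2) + z * se)         # about (-0.394, -0.196)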

Chapter 13: One-Way ANOVA

We will learn how to use hypothesis testing for comparing the difference between the means of several (more than two) populations. The one-way ANOVA is an extension of the t-test for the difference between two means.

Assumptions:

  1. Populations are normally distributed.
  2. Populations have equal variances.
  3. Samples are randomly and independently drawn.

When c ≥ 3, the null hypothesis of no differences in the population means is H0 : µ1 = µ2 = · · · = µc

  1. All population means are equal.
  2. No factor effect (no variation in means among groups).

H1 : Not all of the population means are the same:

  1. At least one population mean is different.
  2. i.e., there is a factor effect.
  3. Does not mean that all population means are different (some pairs may be the same).

Total variation can be split into two parts: SST = SSA + SSW where:

  1. SST = Total Sum of Squares (total variation): the aggregate variation of the individual data values across the various factor levels.
  2. SSA = Sum of Squares Among Groups (among-group variation): variation among the factor sample means.
  3. SSW = Sum of Squares Within Groups (within-group variation): variation of the data values within a particular factor level.

Equations: SST = SSA + SSW (SSA and SSW are given below), MSA = SSA/(c − 1), MSW = SSW/(n − c), MST = SST/(n − 1). The test statistic is F_STAT = MSA/MSW, with c − 1 numerator and n − c denominator degrees of freedom.

where: c = number of groups or levels, nj = sample size from group j, X̄j = sample mean of group j, X̿ = grand mean (mean of all data values)

SSA = n1(X̄1 − X̿)² + n2(X̄2 − X̿)² + · · · + nc(X̄c − X̿)²

SSW = Σj Σi (Xij − X̄j)², i.e., the sum of every observation's squared deviation from its own group mean. Written out for an example with three groups of sizes 3, 4 and 4: SSW = (X11 − X̄1)² + (X21 − X̄1)² + (X31 − X̄1)² + (X12 − X̄2)² + (X22 − X̄2)² + (X32 − X̄2)² + (X42 − X̄2)² + (X13 − X̄3)² + (X23 − X̄3)² + (X33 − X̄3)² + (X43 − X̄3)²
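
A small Python sketch that computes the three sums of squares from raw data and confirms SST = SSA + SSW (the three groups below are placeholder values, not data from the test):

  # One-way ANOVA decomposition check: SST = SSA + SSW (placeholder data).
  import numpy as np

  groups = [np.array([251., 245., 248.]),              # group 1 (3 values)
            np.array([263., 258., 262., 260.]),        # group 2 (4 values)
            np.array([239., 242., 244., 241.])]        # group 3 (4 values)

  all_values = np.concatenate(groups)
  grand_mean = all_values.mean()                        # X-double-bar

  sst = ((all_values - grand_mean) ** 2).sum()
  ssa = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
  ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)
  print(sst, ssa + ssw)                                 # the two totals agree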

One-Way ANOVA F Test

Example: to compare three clubs, you randomly select five distance measurements from trials on an automated driving machine for each club.

  1. State the null and alternative hypotheses. The null and alternative hypotheses we want to test are H0 : µ1 = µ2 = µ3, H1 : not all µj are equal.
  2. Specify the desired level of significance, sample sizes, and degrees of freedom. We have α = 0.05, n1 = n2 = n3 = 5 and c = 3. So, n = n1 + n2 + n3 = 15. Finally, the degrees of freedom are c − 1 = 2 for SSA and n − c = 12 for SSW.
  3. Determine the appropriate technique. This is an F test with c − 1 = 2 and n − c = 12 degrees of freedom.
  4. Determine the critical value. The critical value Fα of the F distribution with α = 0.05, numerator degrees of freedom df1 = 2 and denominator degrees of freedom df2 = 12 can be found using the calculator: Select STAT, F5 (DISTR), F4(F) and then F3(InvF).

Enter the data into List 1 and List 2, with the factor (group number) in List 1 and the dependent variable in List 2.

  1. Select STAT, F3 (TEST), and then F5(ANOV). Now, in the ANOVA menu, select the following options:
  2. How Many: F1 (1)
  3. Factor A: F1 (List1 EXE)
  4. Dependnt: F1 (List2 EXE).
  5. Statistical Decision: Since p-value < α, we reject the null hypothesis H0 at α = 0.05.
  6. Conclusion: There is evidence that at least one µj differs from the rest.
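
The same ANOVA can be cross-checked in Python; the five distances per club below are placeholders, since the question's actual measurements are not reproduced in this sheet:

  # One-way ANOVA F test, c = 3 clubs with n = 5 distances each (placeholder data).
  from scipy import stats

  club1 = [254, 263, 241, 237, 251]
  club2 = [234, 218, 235, 227, 216]
  club3 = [200, 222, 197, 206, 204]

  f_stat, p_value = stats.f_oneway(club1, club2, club3)
  print(f_stat, p_value)                       # reject H0 if p_value < alpha = 0.05

  # Critical value F_alpha with df1 = c - 1 = 2 and df2 = n - c = 12 (step 4):
  print(stats.f.ppf(1 - 0.05, dfn=2, dfd=12))  # about 3.89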

In order to use the ANOVA F test, we have to make certain assumptions:

  1. Randomness and Independence. Select random samples from the c groups (or randomly assign the levels).
  2. Normality. The sample values for each group come from a normal population. The ANOVA F-test is fairly robust against non-normal populations, especially for large samples.
  3. Homogeneity of Variance. All of the populations have equal variances.

Chapter 14: Chi-Square Test for Contingency Tables

χ2 Test for the Difference Between Two Proportions

Left-Handed vs. Gender. We have two variables and each variable has 2 categories:

  • Dominant Hand: Left vs. Right
  • Gender: Male vs. Female

We examined a sample of 300 children. Out of 120 Females, 12 were left-handed, and out of 180 Males, 24 were left-handed. This can be summarized in the 2 × 2 Contingency Table.

To test the Difference Between Two Proportions, we use the null and alternative hypotheses: H0 : π1 = π2, H1 : π1 ≠ π2. Using our example, if we want to test the difference between the proportions of left-handed females and males, we would have:

  • H0 : π1 = π2 means the proportion of females who are left-handed is equal to the proportion of males who are left-handed.
  • H1 : π1 ≠ π2 means the two proportions are not the same, i.e., hand preference is not independent of gender.

If H0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males.

Finding the critical value: Select STAT, F5 (DISTR), F3(CHI) and then F3(InvC).

χ2 Test: Select STAT, F3 (TEST), F3(CHI) and then F2(2WAY). Input the observed data into Mat A (remember to change its dimensions to 2×2), then set the dimensions of Mat B, which will hold the expected frequencies. Go back to the main menu and execute.

Statistical Decision: Since p-value > α = 0.05, we do not reject the null hypothesis H0 at α = 0.05.

Conclusion: There is not sufficient evidence that the two proportions are different at α = 0.05. In other words, there is not sufficient evidence that the proportions of females with dominant left hand and males with dominant left hand are different at α = 0.05.
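
A Python sketch of the same 2 × 2 test using the hand-dominance counts above (12 of 120 females and 24 of 180 males left-handed); correction=False is passed so no continuity correction is applied:

  # Chi-square test on the 2 x 2 gender-by-hand-dominance contingency table.
  from scipy.stats import chi2_contingency

  #            Left  Right
  observed = [[12, 108],    # females
              [24, 156]]    # males

  chi2_stat, p_value, df, expected = chi2_contingency(observed, correction=False)
  print(chi2_stat, p_value, df)   # df = (2-1)(2-1) = 1; do not reject H0 since p_value > 0.05
  print(expected)                 # expected frequencies (what Mat B holds after executing)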

Test for the Differences Among More Than Two Proportions

Now, we have c ≥ 3 independent groups 1, 2, · · · , c with sample sizes n1, n2, · · · , nc respectively. To test the difference between c ≥ 3 (more than two) proportions, we use the null and alternative hypotheses: H0 : π1 = π2 = · · · = πc, H1 : not all πj are equal for j = 1, 2, · · · , c.

In general, for an m × n contingency table, the χ2 test statistic χ2 STAT has (m − 1) × (n − 1) degrees of freedom.

For a given level of significance α, the decision rule is: reject H0 if χ2 STAT > χ2 α, where χ2 α is the critical value of the χ2 distribution and can be found using the Casio calculator.
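
In Python the critical value comes from the chi-square inverse CDF; for instance, assuming α = 0.01 and a 2 × 3 table (so df = (2 − 1)(3 − 1) = 2):

  # Chi-square critical value for a given alpha and table dimensions (example values).
  from scipy.stats import chi2

  alpha = 0.01
  m, n = 2, 3                         # assumed table dimensions for illustration
  df = (m - 1) * (n - 1)              # = 2
  print(chi2.ppf(1 - alpha, df))      # about 9.21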

Using this test is exactly the same as the test for two proportions; the only difference is that the matrix dimensions change.

Statistical Decision: Since the p-value is smaller than α, we reject the null hypothesis.

Conclusion: There is sufficient evidence that the three proportions are different at α = 0.01. In other words, there is sufficient evidence that the groups have a different attitude at α = 0.01.

χ2 Test of Independence

For a contingency table that has r rows and c columns, we will generalize the χ2 Test for proportions to test of independence for two categorical variables.

  1. For the χ2 test of independence, the null and alternative hypotheses are: H0 : The two categorical variables are independent, meaning there is no relationship between them. H1 : The two categorical variables are dependent, meaning there is a relationship between them.

Critical Values: Select STAT, F5 (DISTR), F3(CHI) and then F3(InvC).

Test of independence: Select STAT, F3 (TEST), F3(CHI) and then F2(2WAY). Proceed the same way as the other two tests, but change the matrix dimensions to match the r × c table. Note that the test statistic cannot be negative.

Statistical Decision: Since p-value = 0.99428731 > α = 0.05, we do not reject the null hypothesis H0 at α = 0.05.

Conclusion: There is not sufficient evidence that meal plan and class standing are related at α = 0.05. What does the large p-value = 0.99428731 mean? It means that the probability of getting a test statistic more extreme than 0.709, given that H0 is true, is very high (almost 1). It gives us no reason to conclude that the two variables are dependent.
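
To see where that p-value comes from: it is the upper-tail area beyond the test statistic 0.709. The quoted value is consistent with 6 degrees of freedom (e.g. a 4 × 3 class-standing by meal-plan table; the table itself is not reproduced here, so the dimensions are an assumption):

  # Upper-tail p-value for the chi-square statistic quoted above.
  from scipy.stats import chi2

  chi2_stat = 0.709
  df = 6                              # assumed: (4 - 1) * (3 - 1) = 6
  print(chi2.sf(chi2_stat, df))       # about 0.9943, matching the quoted p-value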

What is the expected frequency fe corresponding to the observed frequency fo = 36? This means: find the entry of the expected-frequency matrix (Mat B) in the same position as the observed value 36 in Mat A. Each expected frequency is fe = (row total × column total) / n.