Key Statistical Concepts and Hypothesis Testing
Multiple Choice Section
- If p is greater than alpha in a hypothesis test, do not reject the null hypothesis.
- The form of a relationship determines the effect of the independent variable (IV) on the dependent variable (DV).
- Degrees of freedom for chi-square: df = (rows – 1) × (columns – 1)
- ANOVA compares the means of three or more groups. A t-test compares the means of two groups.
- The direction of a relationship between two interval/ratio variables depends on the sign of r and b.
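- The significance of a relationship tells us how likely it is that a relationship observed in the sample would appear by chance if no relationship existed in the population.
- Type I error: Rejecting a null hypothesis that is actually true.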
- Confidence Interval: (Point Estimate) +/- (MoE or Error Term)
- Research Hypothesis: A conjectural statement that specifies a directional relationship between two variables.
- Chi-square (χ²) increases as the difference between the observed and expected frequencies increases. A chi-square test assesses how well the pattern of observed frequencies fits some expected pattern of frequencies.
- The slope (b) of the regression line describes the relationship between the IV and the DV: it is the expected change in the DV for a one-unit increase in the IV.
- Parsimony is the idea that the simplest solution is best. Also called Occam’s Razor.
- The regression line is the line of best fit through a scatterplot of Y against X.
- The Law of Large Numbers: As the sample size grows, the sample mean converges toward the population mean. Relatedly, the larger the sample size, the smaller the standard error.
- Expected Value (expected cell frequency in a chi-square table): [(Column Total) × (Row Total)] / Grand Total
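The chi-square items in the list above (expected frequency, degrees of freedom, and the statistic itself) can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names and the 2 × 2 table of counts are made up for the example.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a crosstab."""
    expected = expected_frequencies(observed)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df


# Hypothetical 2 x 2 crosstab of observed counts (not from the notes).
table = [[30, 20],
         [20, 30]]

print(expected_frequencies(table))  # every cell has an expected count of 25.0
print(chi_square(table))            # statistic 4.0 with df = 1
```

The further the observed counts drift from the expected counts, the larger the statistic becomes, which matches the chi-square item in the list.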
Essay Terms
Lambda: Lambda is a proportional reduction of error (PRE) measure of association. It tells us the proportion by which errors in predicting the values of the dependent variable are reduced when we know the values of the independent variable. It is used to determine the degree of association in cross-tabulations when one or both of the variables are measured at the nominal level. Lambda has a serious weakness: when the same modal category of the DV shows up across all categories of the IV, Lambda will always equal 0, even if the variables are related. As a result, we use Cramer’s V or Phi to measure association for nominal-level data; both are based on the χ² distribution.
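A small worked example may make the weakness of Lambda concrete. The sketch below assumes rows are IV categories and columns are DV categories; the counts and function names are hypothetical. Lambda is computed as (E1 – E2) / E1, where E1 is the number of prediction errors made by guessing the overall DV mode and E2 is the number of errors made by guessing the DV mode within each IV category.

```python
import math


def goodman_kruskal_lambda(table):
    """Lambda for a crosstab with IV categories as rows, DV categories as columns."""
    n = sum(sum(row) for row in table)
    dv_totals = [sum(col) for col in zip(*table)]
    e1 = n - max(dv_totals)                         # errors ignoring the IV
    e2 = sum(sum(row) - max(row) for row in table)  # errors knowing the IV
    return (e1 - e2) / e1


def cramers_v(table):
    """Cramer's V: sqrt(chi-square / (N * (min(rows, columns) - 1)))."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))


# Hypothetical counts: the first DV category is the mode in BOTH IV rows.
table = [[60, 40],
         [90, 10]]

print(goodman_kruskal_lambda(table))  # 0.0, although the rows clearly differ
print(round(cramers_v(table), 3))     # 0.346
```

Because the modal DV category is the same in both rows, knowing the IV never changes the best guess, so Lambda reports no association while Cramer’s V does.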
Chi-Square: Chi-Square rests on three assumptions: that the researcher has hypothesized a relationship in advance, that the sample was selected randomly, and that no more than 25 percent of the cells have an expected frequency of less than five. The larger the number of cases, the larger Chi-Square will be, since the adjustment for sample size is only partial. This is as it should be, since a larger sample reduces sampling error and makes a real relationship easier to detect. It also means that Chi-Square should NEVER be used to draw conclusions about the strength of the relationship between IV and DV, since trivial relationships will attain statistical significance if the sample is large enough. A non-significant Chi-Square does NOT mean that our sample is unrepresentative. What it usually means is that the relationship we have observed is so weak that it could easily have occurred by chance.
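The dependence of Chi-Square on sample size can be seen by multiplying every cell of a table by ten. In the hedged sketch below (hypothetical counts, standard library only), the cell proportions and Phi stay the same while Chi-Square grows tenfold and crosses 3.841, the critical value for df = 1 at alpha = 0.05.

```python
import math


def chi_square(observed):
    """Pearson chi-square statistic for a crosstab of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e
    return stat


def phi(observed):
    """Phi for a 2 x 2 table: sqrt(chi-square / N)."""
    n = sum(sum(row) for row in observed)
    return math.sqrt(chi_square(observed) / n)


CRITICAL_DF1_ALPHA_05 = 3.841  # chi-square critical value, df = 1, alpha = 0.05

small = [[55, 45], [45, 55]]                        # N = 200 (made-up counts)
large = [[c * 10 for c in row] for row in small]    # same pattern, N = 2000

for name, table in (("small", small), ("large", large)):
    stat = chi_square(table)
    print(name, round(stat, 2), round(phi(table), 2), stat > CRITICAL_DF1_ALPHA_05)
# small: chi-square 2.0, phi 0.1, not significant
# large: chi-square 20.0, phi 0.1, significant
```

The relationship is equally weak in both tables; only the number of cases changed.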
Type II Error: A Type II Error is failing to reject the null hypothesis when it should have been rejected. In other words, it is claiming, based on observations from the sample, that no relationship exists when in fact there is a relationship between the variables in the population. To reduce the risk of a Type II Error, increase the sample size.
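To illustrate why a larger sample lowers the risk of a Type II Error, the sketch below computes the Type II error rate of a two-sided, two-sample z-test. The setting is an assumption chosen for simplicity (known standard deviation, equal group sizes, a true difference of 0.5); none of these numbers come from the notes.

```python
import math
from statistics import NormalDist


def type_ii_rate(effect, sigma, n_per_group, alpha=0.05):
    """Probability of failing to reject H0 when the true mean difference is `effect`.

    Assumes a two-sided, two-sample z-test with known sigma and equal group sizes.
    """
    z = NormalDist()                                  # standard normal
    z_crit = z.inv_cdf(1 - alpha / 2)                 # about 1.96 for alpha = 0.05
    standard_error = sigma * math.sqrt(2 / n_per_group)
    shift = effect / standard_error                   # true difference in SE units
    power = (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)
    return 1 - power


for n in (10, 100):
    print(n, round(type_ii_rate(effect=0.5, sigma=1.0, n_per_group=n), 2))
# roughly 0.80 with 10 cases per group, roughly 0.06 with 100 per group
```

The true difference is the same in both runs; the smaller sample simply has too large a standard error to detect it reliably.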
Gamma: Gamma is a proportional reduction of error (PRE) measure used to measure the association between two ordinal-level variables. It tends to overstate the amount of association because it ignores tied pairs, that is, cases in the same row or column as the referenced cell. As a result, we would use a more conservative measure such as Kendall’s tau-b or Somers’ d.
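The inflation can be shown with a short calculation. The sketch below assumes the usual pair-counting definitions: Gamma = (C – D) / (C + D), where C and D are the numbers of concordant and discordant pairs, and tau-b = (C – D) / sqrt((n0 – n1)(n0 – n2)), where n0 is the total number of pairs and n1 and n2 are the pairs tied on the row and column variable. The table and function names are hypothetical.

```python
import math


def concordant_discordant(table):
    """Count concordant and discordant pairs in an ordered crosstab."""
    rows, cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # only cells in lower rows
                for m in range(cols):
                    if m > j:                 # lower and to the right
                        concordant += table[i][j] * table[k][m]
                    elif m < j:               # lower and to the left
                        discordant += table[i][j] * table[k][m]
    return concordant, discordant


def gamma(table):
    c, d = concordant_discordant(table)
    return (c - d) / (c + d)                  # ties are ignored entirely


def tau_b(table):
    c, d = concordant_discordant(table)
    n = sum(sum(row) for row in table)
    n0 = n * (n - 1) // 2                                          # all pairs
    n1 = sum(t * (t - 1) // 2 for t in (sum(r) for r in table))    # tied on rows
    n2 = sum(t * (t - 1) // 2 for t in (sum(col) for col in zip(*table)))  # tied on columns
    return (c - d) / math.sqrt((n0 - n1) * (n0 - n2))


# Hypothetical ordinal-by-ordinal crosstab (low/high by low/high).
table = [[40, 10],
         [20, 30]]

print(round(gamma(table), 3))  # 0.714
print(round(tau_b(table), 3))  # 0.408
```

Both measures share the numerator C – D; tau-b is smaller because its denominator also accounts for tied pairs.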
Other
Who was “Student”? William Sealy Gosset.
Where did he work? Guinness Brewery in Dublin, Ireland.
What was the statistic he created? The t statistic.