ANOVA, Chi-Square, and Non-Parametric Statistics: A Comprehensive Guide
Between-subjects: each participant is exposed to only one level of the IV (in a 2-way design, one combination of the levels of the two IVs; 3-way, etc.).
Within-subjects (repeated measures): all participants are exposed to all levels and combinations.
1-way ANOVA (1 IV and 1 outcome/DV): null = the group means are equal (same as the t-test); the omnibus/experimental alternative = at least one mean differs.
You can manipulate more than one independent variable; ANOVA is an extension of the t-test.
Pairwise t-tests cannot examine several independent variables and they inflate the Type I error rate (the familywise, FW, error rate).
Ass: RS (random sampling) Y, IoC (independence of cases) N, Nor (normality) Y, Homo (homogeneity of variance) Y
SSTotal : Total variability.
SSTreatment: Variability due to the experimental manipulation.
SSError: Variability due to individual differences in performance
F-ratio = MS_treat / MS_error --> if F-ratio > F_critical, reject the null
Eta2 = SStreat/SStotal (effect size) 0.01S,0.06M,0.14L
Omega2 = (SStreat - (k-1)MSerror) / (SStotal + MSerror) (effect size) 0.01 S, 0.06 M, 0.14 L
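As a minimal sketch of the formulas above, the F-ratio, eta-squared, and omega-squared can be computed with SciPy and NumPy; the three groups below are made-up illustrative scores, not data from these notes.

```python
import numpy as np
from scipy import stats

# Hypothetical scores for three treatment levels (illustrative data)
g1 = np.array([4.0, 5.0, 6.0, 5.5])
g2 = np.array([6.0, 7.0, 8.0, 7.5])
g3 = np.array([8.0, 9.0, 10.0, 9.5])

# Omnibus 1-way ANOVA: F = MS_treat / MS_error
f_stat, p_value = stats.f_oneway(g1, g2, g3)

# Sums of squares by hand, matching the SS definitions in the notes
all_scores = np.concatenate([g1, g2, g3])
grand_mean = all_scores.mean()
ss_total = ((all_scores - grand_mean) ** 2).sum()
ss_treat = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in (g1, g2, g3))
ss_error = ss_total - ss_treat

# Effect sizes: eta2 = SS_treat / SS_total; omega2 uses MS_error
k, N = 3, len(all_scores)
ms_error = ss_error / (N - k)
eta_squared = ss_treat / ss_total
omega_squared = (ss_treat - (k - 1) * ms_error) / (ss_total + ms_error)
```

Omega-squared is always a bit smaller than eta-squared because it corrects for the positive bias of eta-squared in small samples.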
Multiple Comparisons (after running your ANOVA)
Post hoc tests: Compare two treatments at a time (pairwise comparisons).
Control the familywise (FW) error rate.
Appropriate only when you really don't know what to expect (exploratory research); it is a lot like fishing for significance.
Z. Wang
POST-HOC TEST types:
Tukey HSD test: Results in a single number (HSD) that determines statistical significance. It is conservative.
Bonferroni's method: Divide alpha by the number of comparisons (c) and require that each test be significant at that level (.05/c). It is conservative.
Fisher's LSD procedure: Results in a single number (LSD) that determines statistical significance. It is liberal.
Dunnett's C: Can be applied in situations where the variances are unequal.
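The Bonferroni procedure above can be sketched with plain pairwise t-tests in SciPy; the three groups and their values are hypothetical, invented only to show the .05/c adjustment.

```python
from itertools import combinations
import numpy as np
from scipy import stats

# Hypothetical outcome scores for three treatment groups
groups = {
    "placebo": np.array([2.1, 2.5, 1.9, 2.3]),
    "low":     np.array([3.0, 3.4, 2.8, 3.2]),
    "high":    np.array([4.1, 4.5, 3.9, 4.3]),
}

alpha = 0.05
c = len(list(combinations(groups, 2)))   # number of pairwise comparisons
adjusted_alpha = alpha / c               # Bonferroni: require p < .05/c per test

results = {}
for a, b in combinations(groups, 2):
    t, p = stats.ttest_ind(groups[a], groups[b])
    results[(a, b)] = (p, p < adjusted_alpha)   # (raw p, significant after correction?)
```

With three groups there are c = 3 comparisons, so each test must reach p < .0167 to keep the familywise error rate near .05.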
ORTHOGONAL CONTRASTS
The weights of each contrast sum to 0 (down each column).
Contrast 1:Placebo ≠ (Low, High) /// Contrast 2: Low≠High
Trend Analysis
If you have ordered groups, you often will want to know
whether there is a consistent trend across the ordered groups
(e.g., linear trend).
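A minimal sketch of a linear trend contrast for three ordered groups: the weights (-1, 0, 1) sum to zero, as required for an orthogonal contrast, and the group means below are invented for illustration.

```python
import numpy as np

# Ordered group means (e.g., low, medium, high dose) -- illustrative values
group_means = np.array([2.0, 3.1, 4.3])

# Linear trend weights: an orthogonal contrast, so they must sum to 0
linear_weights = np.array([-1, 0, 1])
assert linear_weights.sum() == 0

# Contrast value: a value far from 0 suggests a linear trend across the groups
L = (linear_weights * group_means).sum()
```

Here L is simply the high-group mean minus the low-group mean; in a full trend analysis this contrast would be tested against its standard error.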
2-/3-way ANOVA: 2 or more IVs (each with multiple levels), 1 DV
- levels of the IVs are fixed (not continuous)
- random assignment to combinations of IV levels
Example of a 2-Way ANOVA: (3×2)
Research Question: Does the type of teaching method and student background affect test scores?
Independent Variables:
Teaching Method (with levels such as Traditional, Online, Blended)
Student Background (with levels such as Urban, Rural)
Dependent Variable: Test scores (measured continuously)
Design: Each student is randomly assigned to one combination of Teaching Method and Student Background, and their test score is recorded.
Example of a 3-Way ANOVA: (3x3x2)
Research Question: How do exercise type, diet, and gender affect weight loss?
Independent Variables:
Exercise Type (with levels such as Cardio, Strength Training, No Exercise)
Diet (with levels such as Low Carb, High Protein, No Diet)
Gender (with levels such as Male, Female)
Dependent Variable: Weight loss (measured in pounds)
Design: Participants are randomly assigned to one of the combinations of Exercise Type, Diet, and Gender, and their weight loss is measured after a fixed period.
anova observed effects example: (2-way anova –> 2main effects)
Diet Type (Vegetarian, Non-Vegetarian) and Exercise Regimen (None, Light, Intensive) on weight loss (dv)
A main effect is the effect of an IV on the DV
averaged over the other variable or when the other variable
is ignored. (diet type on weight loss), can be misleading.
An interaction effect means the effect of one independent variable on the dependent variable differs
depending on the level of another independent variable. (diet type and exercise regimen)
A simple effect is the effect of one variable at a specific level of another independent variable (e.g., diet type at the "light exercise" level).
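The three kinds of effects above can be read off a table of cell means; the 2x3 diet-by-exercise means below are made-up numbers chosen so that an interaction is visible.

```python
import numpy as np

# Cell means for a 2x3 design: Diet (rows) x Exercise Regimen (cols)
#                     None  Light  Intensive   (illustrative values)
cell_means = np.array([
    [1.0,  3.0,  6.0],   # Vegetarian
    [1.0,  2.0,  3.0],   # Non-vegetarian
])

# Main effect of diet: average each row over the exercise levels (other IV ignored)
main_effect_diet = cell_means.mean(axis=1)

# Main effect of exercise: average each column over the diet levels
main_effect_exercise = cell_means.mean(axis=0)

# Simple effect of diet at "Light" exercise only: compare within that one column
simple_effect_light = cell_means[:, 1]

# Interaction: the diet difference changes across exercise levels (0, 1, 3 here)
diet_gap_by_exercise = cell_means[0] - cell_means[1]
```

Because the diet gap grows from 0 to 3 across exercise levels, the diet main effect alone would be misleading, which is exactly the point made in the notes.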
2-way anova between subjects summary table: (same calculations)
a = levels of IV 1, b = levels of IV 2, N = total sample size, k = number of levels of an IV
3-way ANOVA
3-way ANOVA means that there are three IVs and one DV,
and includes:
Three main effects.
Three 2-way interactions and one 3-way interaction.
Recommendations: It is difficult to interpret the results of a 3-way interaction.
Here is the order when checking the results:
1. The 3-way interaction: if it is significant, interpret it and you can ignore the lower-order effects; if not, go to the next step.
2. The 2-way interactions: if an interaction is significant, you can ignore the main effects involved in that interaction; if not, go to the next step.
3. The main effects: if a main effect is significant, examine it; if not, ignore it.
Repeated measures Anova summary table:
Order effects (trial effects)
The order in which the participant receives a treatment (first,
second, etc.) will affect how the participant behaves.
It is a big problem with within-subjects designs.
Order effects may be due to:
1 Practice effects
After doing the dependent-measure task several times, a
participant’s performance may improve. In a within-
subjects design, this improvement might be incorrectly
attributed to having received a treatment.
2 Fatigue effects
Decreased performance on the dependent measure due to
being tired or less enthusiastic as the experiment
continues. Fatigue effects could be considered negative
practice effects.
3 Carry-over effects
The effects of a treatment administered earlier in the
experiment persist so long that they are present even
while participants are receiving additional treatments.
Carryover effects create problems for within-subjects
designs because you may believe that the participant’s
behaviour is due to the treatment just administered when,
in reality, the behaviour is due to the lingering effects of a
treatment administered some time earlier
4 Sensitization
After getting several different treatments and performing
the dependent-variable task several times, participants in
a within-subjects design may become sensitive to what the
hypothesis is.
5 Sequence effects
If participants who receive one sequence of treatments
score differently than those participants who receive the
treatments in a different sequence, there is a sequence
effect.
ANOVA 2-/3-Way with Unequal Sample Sizes (between-subjects design)
The issue of which type of SS to use for unbalanced designs is still controversial; different texts and different authors offer different recommendations.
In general, using Type III Sum of Squares (unweighted
mean) is the best and most common approach to analyze
unbalanced designs.
- Unequal sample sizes: the SS components will no longer sum to SS total
total unweighted mean = (-0.5670 + 0.0102)/2 = -0.2784
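The unweighted-mean idea behind Type III SS can be sketched directly: average the group means themselves rather than pooling the raw scores. The two means below come from the example in the notes; the group sizes are invented to show the contrast with a weighted (pooled) mean.

```python
import numpy as np

# Group means from the notes; the unequal n's are hypothetical
means = np.array([-0.5670, 0.0102])
ns    = np.array([12, 30])

# Unweighted mean (Type III logic): each group counts equally
unweighted = means.mean()

# Weighted mean: larger groups pull the average toward their own mean
weighted = (means * ns).sum() / ns.sum()
```

With unequal n's the two answers differ, which is why the choice of SS type matters for unbalanced designs.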
Chi-square test (2 types below). Ass: RS (random sampling) Y, IofO (independence of observations) N, ExFr (expected frequencies) N
-Chi-square tests are utilized for analyzing categorical (nominal) data.
-The chi-square (χ²) distribution's shape is heavily influenced by the degrees of freedom (df).
-As df increases, the χ² distribution becomes more symmetric.
-Both the mean and variance of the χ² distribution are directly related to df; specifically, the mean is equal to df and the variance is twice the df.
Goodness-of-Fit Test:
-This test evaluates how well the observed frequencies match expected frequencies according to a specific hypothesis.
-It’s often used for one-dimensional data to assess if a single categorical variable follows a distribution pattern.
Contingency Table Test (also known as Test of Independence):
-This test determines if there are significant associations between two categorical variables.
It’s conducted on a two-way table, where each cell represents the frequency for the combination of categories.
-Both tests involve a comparison of observed frequencies (O) with expected frequencies (E).
-Expected frequencies are based on the null hypothesis, assuming no relationship or difference exists,
or they can be derived from theoretical distributions relevant to the research question.
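Both tests are available in SciPy; the die-roll counts and the 2x2 table below are invented illustrative frequencies.

```python
import numpy as np
from scipy import stats

# Goodness-of-fit: do 60 die rolls match a fair die? (illustrative counts)
# stats.chisquare defaults to equal expected frequencies under the null
observed = np.array([8, 9, 12, 11, 10, 10])
chi2_gof, p_gof = stats.chisquare(observed)

# Test of independence on a 2x2 contingency table (illustrative counts);
# expected frequencies are derived from the row and column totals under the null
table = np.array([[20, 30],
                  [25, 25]])
chi2_ind, p_ind, df, expected = stats.chi2_contingency(table)
```

For an R x C table the degrees of freedom are (R-1)(C-1), so the 2x2 example has df = 1.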
EFFECT SIZEs
d-family
Based on one or more measures of the differences between
groups or levels of the independent variable
r-family
Based on a correlation coefficient between the variables (e.g., between the independent and dependent variables)
Phi
It represents the correlation between two variables and
applies only to 2×2 tables
Cramér’s V
Cramér extended phi to larger tables by defining V = sqrt(χ² / (N(k − 1))), where N is the sample size and k is defined as the smaller of R and C. When k = 2 the two statistics, phi and V, are equivalent.
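Cramér's V follows directly from the chi-square statistic; the 2x2 table below is an invented example, chosen so that V has a clean value.

```python
import numpy as np
from scipy import stats

# Illustrative 2x2 table; correction=False skips Yates' continuity correction
table = np.array([[10, 20],
                  [20, 10]])
chi2, p, df, expected = stats.chi2_contingency(table, correction=False)

# V = sqrt(chi2 / (N * (k - 1))), with k the smaller of R and C
N = table.sum()
k = min(table.shape)
cramers_v = np.sqrt(chi2 / (N * (k - 1)))   # for a 2x2 table this equals |phi|
```

Here chi² = 100/15 on N = 60 observations, so V = 1/3, a moderate association.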
ANCOVA
-ANCOVA is used to test for differences between group means while controlling for the influence of extraneous
variables (ex: your dv is math performance BUT you also measured aptitude before and want to see if it affects as a covariate)
-It adjusts for the effects of confounding variables, providing a clearer picture of the relationship between
the independent and dependent variables.
-To control for the effect of the covariate, it is removed from the
achievement scores by using the regression method. Then the F test can
be performed on the adjusted achievement scores.
Advantages:
-Reduces error variance by accounting for variation in the dependent variable that’s due to extraneous
variables, not the main independent variable(s).
-Enhances experimental control and increases the validity of the results by adjusting
for confounders, allowing for a more accurate assessment of the primary relationships of interest
Conducting ANCOVA in SPSS:
1 covariate =1df (for it)
Check Assumptions:
-Begin by testing the assumption of homogeneity of regression slopes, also known as the
equal slopes assumption.
-This means ensuring that the relationship between the covariate and the dependent variable is
consistent across all levels of the independent variable(s).
Linearity:
Verify the linear relationship between the covariate (aptitude scores) and the dependent variable (achievement scores).
No Interaction:
Confirm there is no significant interaction between the covariate and the independent variable (teaching methods),
as ANCOVA is not appropriate in the presence of a significant covariate by treatment interaction
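The regression-adjustment idea can be sketched in a few lines. This is a simplified illustration, not a full ANCOVA: it uses a single overall slope rather than the pooled within-group slope, and all data (aptitude, achievement, group labels) are invented.

```python
import numpy as np
from scipy import stats

# Hypothetical data: two teaching methods, aptitude covariate, achievement DV
aptitude = np.array([50, 55, 60, 65, 52, 57, 62, 67], dtype=float)
achieve  = np.array([70, 74, 78, 82, 75, 79, 83, 87], dtype=float)
group    = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Regress achievement on the covariate, then remove the covariate's effect
slope, intercept = np.polyfit(aptitude, achieve, 1)
adjusted = achieve - slope * (aptitude - aptitude.mean())

# F test on the covariate-adjusted scores (the notes' "adjusted achievement")
f_stat, p_val = stats.f_oneway(adjusted[group == 0], adjusted[group == 1])
```

Removing the covariate's variance shrinks the error term, which is exactly the "reduces error variance" advantage listed above.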
Mixed design ANOVA
– for Mixed design you need to have at least one between-subjects IV and one within-subjects IV
-ex: having a pretest and a postest (within) for 2 groups (between)
Ass: Nor (normality) Y, Homo (homogeneity) Y, Inde (independence) N
Non parametric stats: (chi square test is one type)
The terms nonparametric and distribution-free, which have
slightly different meanings, are often used interchangeably.
When do we use nonparametric procedures?
When normality cannot be assumed.
When there is not sufficient sample size to assess the form of
the distribution
Rank stats:
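The common rank-based tests map onto familiar parametric ones; a small sketch with SciPy, using three invented samples.

```python
import numpy as np
from scipy import stats

# Illustrative samples for three groups
a = np.array([1.2, 2.3, 2.9, 3.1, 4.0])
b = np.array([2.8, 3.5, 4.1, 4.6, 5.0])
c = np.array([4.9, 5.2, 5.8, 6.1, 6.5])

# Mann-Whitney U: rank-based alternative to the independent-samples t test
u_stat, p_mw = stats.mannwhitneyu(a, b)

# Kruskal-Wallis H: rank-based alternative to the 1-way ANOVA
h_stat, p_kw = stats.kruskal(a, b, c)

# Wilcoxon signed-rank: rank-based alternative to the dependent-samples t test
w_stat, p_w = stats.wilcoxon(a, b)
```

Because these tests work on ranks rather than raw scores, they discard some information, which is the power cost listed under the disadvantages below.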
OTHER PROCEDURES
Resampling
Bootstrapping: A statistical method for estimating the
sampling distribution of an estimator by sampling with
replacement from the original sample, most often with the
purpose of deriving robust estimates of standard errors and
confidence intervals of a population parameter like a mean
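The bootstrap described above can be sketched with NumPy alone: resample with replacement, collect the statistic, and take percentiles. The sample values are invented.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed for reproducibility
sample = np.array([4.1, 5.3, 2.8, 6.0, 4.7, 5.5, 3.9, 4.4])

# Draw many resamples (with replacement, same size as the original sample)
# and record the mean of each one
boot_means = np.array([
    rng.choice(sample, size=len(sample), replace=True).mean()
    for _ in range(5000)
])

# Percentile bootstrap 95% CI for the population mean
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
```

The spread of `boot_means` is itself an estimate of the standard error of the mean, with no normality assumption required.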
ADVANTAGES AND DISADVANTAGES
Advantages
Get a quick answer with little calculation.
Appropriate when the sample sizes are small.
Disadvantages
No parameters to describe and it becomes more difficult to
make quantitative statements about the actual difference
between populations.
Throw away information.
Less statistically powerful than their parametric counterparts
USE OF NONPARAMETRIC PROCEDURES
Each nonparametric procedure has its peculiar sensitivities
and blind spots. It is always advisable to run different
nonparametric tests.
Should discrepancies in the results occur contingent upon which test is used, one should try to understand why some tests give different results.
Fundamental concepts and definitions:
Two branches of statistics:
Descriptive statistics
Summarize and describe a group of numbers.
Inferential statistics
Try to infer information about a population by using
information gathered by sampling.
Terms
Population: The complete set of data elements is termed the
population. Large or small; finite or infinite.
Sample: A sample is a portion of a population selected for further
analysis.
Parameter: A parameter is a characteristic of the whole population.
Statistic: A statistic is a characteristic of a sample, presumably
measurable.
Randomness
Assuming that the sample is truly random, we not only can
estimate certain characteristics of the population, but also
can have a very good idea of how accurate our estimates
are. To the extent that the sample is not random, our
estimates may or may not be meaningful, because the
sample may or may not accurately reflect the entire
population.
External validity: random selection
Internal validity: random assignment
STATISTICS & THREE COMMON RESEARCH METHODS
Experimental Method
Must satisfy requirements to be an experiment—high level of
control to isolate cause and effect; must manipulate levels of
an independent variable (the proposed cause); random
assignment renders groups equivalent ; comparison/control—
at least two groups observed (dependent variable—proposed
effect).
Quasi-Experimental Method
Including a non-manipulated IV (classification variable); lack
of random assignment to groups; and/or lack of control
(comparison) group (e.g., boys vs girls ).
Correlational Method
Variables measured as they naturally occur—lack of random
assignment and control to determine cause-effect
Nominal Scale
The nominal scale is the most basic level of measurement, where data can be categorized based on qualitative traits but not ordered. Categories on a nominal scale are mutually exclusive, and no order or ranking can be inferred from the labels.
Example: Categories like race, gender, or marital status are nominal. These categories simply denote different groups without any implied hierarchy or order.
Ordinal Scale
Data measured on an ordinal scale is categorized into groups that can be ranked; however, the intervals between the rankings are not necessarily equal. This means while you can say one rank is higher or lower, you can’t quantify the difference between the ranks.
Example: Educational levels (e.g., elementary school, high school, college, graduate school) are ordinal. They indicate a progression, but the difference in educational attainment between each level is not uniform.
Interval Scale
The interval scale is a numerical scale where the order of the numbers is meaningful, and the intervals between each value are equal. However, it does not have a true zero point, meaning the zero value does not imply the absence of the quantity being measured.
Example: Temperature in degrees Celsius or Fahrenheit is an interval scale because the difference between each degree is the same, but 0 degrees does not mean the absence of temperature.
Ratio Scale
The ratio scale possesses all the properties of an interval scale, and in addition, it has a true zero point. This true zero allows for a meaningful zero value (indicating none of the variable) and the calculation of ratios.
Example: Weight and height are measured on a ratio scale, as they have equal intervals, and a value of zero means there is no weight or height, respectively.
Boxplot
Describes the centre and dispersion of datasets and locates outliers.
Detects and illustrates location and variation changes between different groups of data.
Assumes that the data are unimodal; thus, never claim that a distribution is normal simply on the basis of a boxplot.
The histogram graphically shows the following
Centre of the data
Spread of the data (kurtosis; heavy or light tailed)
Skewness of the data (symmetry)
Presence of outliers
Presence of multiple modes in the data (modality, uni-, bi-, or
multi-modal)
Normality
Outliers: correct them, ignore them, remove them, transform them, replace them
Probability theory
Three approaches:
Classical (Analytic) Approach
The classical approach to probability is based on the assumption that the outcomes of an event are equally likely. This method calculates the probability of an event as the number of favorable outcomes divided by the total number of possible outcomes.
Example: Consider the rolling of two dice and summing the numbers on the top faces. If interested in calculating the probability of the sum being seven, you count how many combinations (e.g., (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)) out of all possible combinations of two dice result in seven, giving you a probability of 6/36 or 1/6.
Frequentist Approach
The frequentist approach defines the probability of an event as the limit of its relative frequency after a large number of trials. This approach does not require any assumptions about the likelihood of individual outcomes but instead relies on actual experimental or observed data.
Example: If you flip a coin many times and observe that it lands heads 500 times out of 1,000 flips, you might estimate the probability of getting heads as 0.5, based on these empirical results.
Subjective Approach
The subjective approach to probability is based on personal judgment or belief about the likelihood of an event occurring. This approach is often used when it is not possible to apply classical or frequentist probability due to a lack of symmetry in outcomes or insufficient empirical data.
Example: If someone says, “I think there is a 70% chance that tomorrow will be a good day,” they are expressing a subjective probability based on personal belief, not on a statistical or empirical model.
Mutually exclusive or disjoint
Two events that cannot happen at the same time: if one occurs, the other cannot (e.g., tossing a coin twice: Event A = two heads; Event B = two tails).
Probability rule—Addition Rule
If A and B are mutually exclusive or disjoint
events, then p(A or B) = p(A) + p(B). E.g., the
probability of drawing either a heart or a
spade from a deck of playing cards = 13/52 +
13/52 – 0/52 = 26/52.
If A and B are not mutually exclusive or
disjoint events, then p(A or B) = p (A) + p (B)
– p (A and B). E.g., the probability of
drawing a heart or a 3 from a deck of
playing cards = 13/52 + 4/52 – 1/52 = 16/52
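The card example can be verified by enumerating a standard deck; a small sketch using exact fractions.

```python
from fractions import Fraction

# Build a standard 52-card deck: 13 ranks (1 = ace ... 13 = king) x 4 suits
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for suit in suits for rank in range(1, 14)]

p_heart = Fraction(sum(1 for r, s in deck if s == "hearts"), len(deck))
p_three = Fraction(sum(1 for r, s in deck if r == 3), len(deck))
p_heart_and_three = Fraction(sum(1 for r, s in deck if s == "hearts" and r == 3),
                             len(deck))

# Not mutually exclusive (the 3 of hearts is in both events), so:
# p(A or B) = p(A) + p(B) - p(A and B)
p_heart_or_three = p_heart + p_three - p_heart_and_three
```

The overlap term subtracts the 3 of hearts so it is not counted twice, giving 16/52 exactly as in the notes.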
INDEPENDENT
Independent
Events A and B are independent if knowing that A occurs
does not affect the probability that B occurs (e.g., tossing two
coins: Event A = the first coin is a head; Event B = the second
coin is a head).
Probability rule—AND/Multiplication Rule
If A and B are independent, then p(A and B) = p(A)*p(B).
Joint probability
In N independent trials, suppose NA, NB, NAB denote the
number of times events A, B and AB occur respectively.
According to the frequency interpretation of probability,
for large N. p(A) = NA/N, p(B) = NB/N, p(A and B) = NAB/N
Disjoint events cannot be independent! If A and B can not occur together
(disjoint), then knowing that A occurs does change probability that B occurs.
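The two-coin example can be checked by enumerating all four equally likely outcomes; this confirms the multiplication rule p(A and B) = p(A)p(B) for independent events.

```python
from fractions import Fraction
from itertools import product

# Two fair coin tosses: A = first coin is heads, B = second coin is heads
outcomes = list(product("HT", repeat=2))     # HH, HT, TH, TT
p = Fraction(1, len(outcomes))               # classical approach: each outcome 1/4

p_A  = sum(p for o in outcomes if o[0] == "H")
p_B  = sum(p for o in outcomes if o[1] == "H")
p_AB = sum(p for o in outcomes if o == ("H", "H"))

# Multiplication rule holds exactly, so A and B are independent
independent = (p_AB == p_A * p_B)
```

By contrast, for disjoint events p(A and B) = 0 while p(A)p(B) > 0, which is why disjoint events cannot be independent.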
Normal distribution is theoretical.
The theoretical distribution with data that are symmetrically distributed
around the mean, median, and mode.
Bell-shaped and unimodal
Mean = median = mode
Asymptotic tails
Area under the curve = 1
Defined by Two Parameters: The shape of the normal distribution is determined by two parameters—
mean (μ) and standard deviation (σ).
-The mean determines the center of the distribution, and the standard deviation
determines the spread or width of the distribution
-The theoretical distribution of the hypothetical set of
sample means obtained by drawing an infinite number of
samples from a specified population can be shown to be
approximately normal under a wide variety of conditions.
Such a distribution is called the sampling distribution of
the mean
Z score
Why use Z scores?
Can compare distributions with different means and
standard deviations.
Used to “STANDARDIZE” variables – allows variables to be
compared to one another even when measured on different
scales.
Bill has an IQ of 145, IQ in the population has a mean of 100 and a
standard deviation of 15
(145 - 100) / 15 = 3 (Bill's IQ is 3 SDs above the mean)
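The standardization formula is one line of code; the IQ numbers come from the example above.

```python
def z_score(x, mean, sd):
    """Standardize a raw score: how many SDs x lies above or below the mean."""
    return (x - mean) / sd

# Bill's IQ of 145, with population mean 100 and SD 15
bill_z = z_score(145, 100, 15)   # 3 SDs above the mean
```

Because z-scores put every variable on the same scale (mean 0, SD 1), scores from distributions with different means and SDs become directly comparable.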
modality: how any peaks (first 2) (can be assesed with qq plots)
skewness: positive or negative (3 and 4)
kurtosis: deviation form the normal distribution
(last 2 /platykurtic – scores more in the middle and leptokurtic-more peak and longer tails/)
Normality (first graph): if you divide kurtisis and skew in spss by the std error next to them, if score is > 0.05 then not normal
SAMPLING & HYPOTHESIS TESTING
CONTROVERSY
The standard error of mean is the standard deviation of the
distribution of sample means
Type 1 error: false positive (mistakenly reject H0)
Type 2 : false negative (fail to reject H0)
Here are factors that affect the power of a test:
the size of alpha
1 or 2 tailed test
the separation of μ0 and μ1 (i.e., the effect size)
the size of the population variance,
the sample size, n
Unified approach to null hypothesis testing
Point estimation
Use a sample statistic (e.g., a sample mean) to estimate a
population parameter (e.g., a population mean).
Advantage: It is an unbiased estimator, that is, the sample mean
will equal the population mean on average.
Disadvantage: Have no way of knowing for sure whether a
sample mean equals the population mean. For this reason,
researchers often report a point estimate with an interval
estimate.
Interval estimation
Confidence interval (CI)—Interval or range of possible values
within which an unknown population parameter is likely to
be contained.
Level of confidence—Probability or likelihood that an interval
estimate will contain an unknown population parameter (e.g.,
population mean) (e.g., 95% CI [2.35 – 27.65])
-If the value specified in H0 falls outside the confidence interval,
H0 should be rejected (otherwise it is retained)
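A 95% CI around a sample mean can be sketched with the t distribution (appropriate when the population SD is unknown); the sample values below are invented.

```python
import numpy as np
from scipy import stats

sample = np.array([12.0, 15.0, 14.0, 10.0, 13.0, 16.0, 11.0, 14.0])

mean = sample.mean()
sem = stats.sem(sample)                          # standard error of the mean
t_crit = stats.t.ppf(0.975, df=len(sample) - 1)  # two-tailed 95% critical value

ci_low, ci_high = mean - t_crit * sem, mean + t_crit * sem
```

If a hypothesized population mean (the value in H0) falls outside [ci_low, ci_high], H0 is rejected at the .05 level.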
T-tests
When σ (the population standard deviation) is known
One sample z test
When σ is not known
One sample t test (IofOb N, Nor Y, RanS Y)
Purpose: Compare a sample mean to a known population mean when population variance is unknown.
Theory: Assumes the sample is drawn from a normally distributed population.
Independent samples t test (IofOb N +Homo Y)
Purpose: Compare means between two independent groups.
Theory: Assumes both samples are from normally distributed populations.
Uses the pooled standard deviation to calculate the t-statistic when the variances are assumed equal.
Dependent samples t test (IofOb N, + Corr N)
Purpose: Compare means between two related groups (matched pairs or repeated measures).
Theory: Assumes the differences within pairs are normally distributed. Calculates the difference for each pair, then computes the t-statistic on those differences.
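All three t tests have direct SciPy counterparts; the samples below are made-up illustrative data.

```python
import numpy as np
from scipy import stats

# One-sample t test: compare a sample mean to a known population mean (100)
sample = np.array([102, 98, 110, 105, 95, 108], dtype=float)
t1, p1 = stats.ttest_1samp(sample, popmean=100)

# Independent-samples t test: two unrelated groups
g1 = np.array([5.1, 4.8, 5.5, 5.0])
g2 = np.array([6.2, 6.0, 6.5, 6.1])
t2, p2 = stats.ttest_ind(g1, g2)

# Dependent-samples (paired) t test: pre/post scores for the same participants
pre  = np.array([10.0, 12.0, 11.0, 13.0])
post = np.array([12.0, 14.0, 12.0, 15.0])
t3, p3 = stats.ttest_rel(pre, post)   # works on the pre - post differences
```

Since every post score exceeds its pre score, the paired differences (pre - post) are negative and so is t3.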
GLM
Ass: InofCases N, Lin N, Nor Y, RanS Y
FACTORS INFLUENCING THE PEARSON R: Linearity and Outliers, Restriction of range
Simple and logistic regressions:
Logistic Regression
-Purpose: Used to predict the probability of a categorical dependent variable based on one or more predictor variables.
This method is suitable when the outcome variable is binary or multinomial (e.g., success/failure, yes/no).
-Model: The relationship between predictors and the log odds of the outcomes is modeled using a logistic function, which is non-linear.
Key Metrics:
Odds Ratio (Exp(B)): Represents the change in odds for a unit increase in the predictor.
Wald Statistic: Tests the significance of each coefficient.
Nagelkerke R²: A pseudo-R² that approximates the proportion of variance explained by the model.
Application: Common in medical fields for binary outcomes, market research for choice modeling, and any field where outcomes are categorical.
Simple Regression
Purpose: Used to predict a continuous dependent variable from a single predictor variable.
Model: Assumes a linear relationship between the predictor and the outcome.
Key Metrics:
R-squared (r²): Represents the proportion of variance in the dependent variable that is predictable from the independent variable.
Regression Coefficient (B): Indicates the change in the dependent variable for a one-unit change in the independent variable.
Standardized Coefficients: Provide a measure of the strength of the relationship between the variables, expressed in standard deviation units.
Application: Widely used in economics, business, and social sciences to understand relationships between variables and for forecasting.
Key Differences
Outcome Variable: Logistic regression is used for categorical outcomes, whereas simple regression is used for continuous outcomes.
Function Shape: Logistic regression uses a logistic curve, non-linear, to model the probability of an event. In contrast, simple regression uses a straight line to model the relationship.
Interpretation of Coefficients: In logistic regression, coefficients represent the change in log odds per unit change in the predictor,
while in simple regression, coefficients represent the change in the outcome variable per unit change in the predictor.
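The simple-regression side of the comparison can be sketched with NumPy alone: fit the least-squares line and compute r² from the residuals. The x/y values are invented, nearly linear data.

```python
import numpy as np

# Illustrative continuous predictor and outcome
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares fit: np.polyfit returns (slope, intercept) for degree 1
b, a = np.polyfit(x, y, 1)
y_hat = a + b * x            # predicted outcome: one-unit change in x adds b to y

# r-squared: proportion of variance in y predictable from x
ss_res = ((y - y_hat) ** 2).sum()
ss_tot = ((y - y.mean()) ** 2).sum()
r_squared = 1 - ss_res / ss_tot
```

The coefficient b is the "change in the outcome per unit change in the predictor" described above; in logistic regression the analogous coefficient would instead shift the log odds.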