Statistical Inference and Regression Analysis: Key Concepts
Statistical Inference: Drawing conclusions about a population based on information from a sample.
Standard Error: The estimated standard deviation of the sampling distribution of the sample mean, calculated as the sample standard deviation divided by the square root of the sample size (s / √n).
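The standard error calculation can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name standard_error and the sample values are hypothetical, chosen only for the example.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n).

    statistics.stdev uses the n - 1 denominator (sample standard deviation).
    """
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Hypothetical sample, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print("mean:", statistics.mean(sample))
print("standard error:", round(standard_error(sample), 4))
```

Because the sample size sits under a square root, quadrupling the sample size halves the standard error when the sample standard deviation stays the same.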
Hypothesis Test Decisions:
- Reject the null hypothesis
- Fail to reject the null hypothesis
Multiple Linear Regression
A regression model that estimates the relationship between two or more independent variables and a dependent variable.
- Coefficient: The expected change in the dependent variable for a one-unit increase in an independent variable, holding the other independent variables constant. Its sign gives the direction of the relationship.
- Intercept: The expected value of the dependent variable when all independent variables are zero.
- R-squared: A measure indicating the proportion of variance in the dependent variable predictable from the independent variables.
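- Adjusted R-squared: A modified R-squared that penalizes additional predictors, so it increases only when a new predictor improves the fit more than would be expected by chance.
- Standard Error of the Estimate: Measures the accuracy of predictions as the typical distance of observed values from the regression line, calculated as the square root of the residual sum of squares divided by its degrees of freedom (n − k − 1 for k predictors).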
- Residuals: The differences between observed and predicted values, used to assess model accuracy.
- Homoscedasticity: The assumption that the variance of residuals is constant across all values of the independent variables.
- Multicollinearity: High correlation between independent variables, making it difficult to determine individual effects.
- Outliers: Observations far from other data points, which can influence regression results.
- Interaction Term: The product of two independent variables, included to assess whether the effect of one changes with the level of the other.
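Several of the regression quantities listed above (intercept, coefficients, residuals, R-squared, adjusted R-squared, and the standard error of the estimate) can be computed together in one short sketch. The version below fits ordinary least squares by solving the normal equations in pure Python. The data values and the helper names solve and fit_ols are hypothetical, and a statistics library would normally be used in practice.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Return [intercept, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in predictors]  # leading 1.0 is the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data: two independent variables (x1, x2) and one dependent variable.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [9.1, 7.8, 19.2, 17.9, 29.1, 27.9]

beta = fit_ols(predictors, y)
predicted = [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in predictors]
residuals = [obs - pred for obs, pred in zip(y, predicted)]

n, k = len(y), len(predictors[0])
y_mean = sum(y) / n
ss_res = sum(e ** 2 for e in residuals)       # unexplained variation
ss_tot = sum((v - y_mean) ** 2 for v in y)    # total variation

r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
see = math.sqrt(ss_res / (n - k - 1))         # standard error of the estimate

print("intercept and coefficients:", [round(b, 3) for b in beta])
print("R-squared:", round(r2, 4), "adjusted:", round(adj_r2, 4))
print("standard error of the estimate:", round(see, 4))
```

With an intercept in the model the residuals sum to zero, and adjusted R-squared never exceeds R-squared. An interaction term could be added under the same approach by appending the product x1 * x2 as a third column in each row.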
Hypothesis Testing Concepts
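- P-value: The probability of obtaining test results at least as extreme as those observed, assuming the null hypothesis is true.
- Significance Level (α): The threshold below which a p-value leads to rejecting the null hypothesis, commonly 0.05 or 0.01. It is the probability of rejecting a true null hypothesis (a Type I error).
- Confidence Interval: A range of values, computed from sample data, constructed so that a stated proportion of such intervals (for example 95%) would contain the true population parameter under repeated sampling.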
- Test Statistic: A standardized value calculated from sample data, compared to a critical value.
- Critical Value: The cutoff, determined by the significance level, that the test statistic must exceed to reject the null hypothesis.
- One-Sided Test: Tests for a directional effect (greater than or less than).
- Two-Sided Test: Tests for any difference, without specifying direction.
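The concepts in this list fit together in a single calculation. The sketch below runs a one-sample z-test, which assumes the population standard deviation is known, using Python's statistics.NormalDist. The function names z_test and z_confidence_interval and all numeric inputs are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha=0.05, two_sided=True):
    """One-sample z-test; returns (test statistic, p-value, critical value, reject?)."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    if two_sided:
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    else:
        # One-sided test in the "greater than" direction.
        p_value = 1 - std_normal.cdf(z)
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical
    return z, p_value, critical, reject


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for the mean with known sigma."""
    margin = NormalDist().inv_cdf(1 - (1 - confidence) / 2) * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs: sample mean 52, null-hypothesis mean 50, sigma 5, n = 25.
z, p, crit, reject = z_test(52, 50, 5, 25)
print("two-sided:", round(z, 3), round(p, 4), round(crit, 3), reject)

z1, p1, crit1, reject1 = z_test(52, 50, 5, 25, two_sided=False)
print("one-sided:", round(z1, 3), round(p1, 4), round(crit1, 3), reject1)

print("95% CI:", z_confidence_interval(52, 5, 25))
```

The two decision rules agree: the p-value falls below α exactly when the test statistic exceeds the critical value. For the same data, the one-sided p-value is half the two-sided one when the effect is in the tested direction.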
Statistical Tests
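- Z-Test: Used to compare means when the population variance is known, or when the sample size is large enough that the sample variance is a reliable substitute.
- T-Test: Used to compare means when the population variance is unknown and must be estimated from the sample, particularly when the sample size is small.
- ANOVA: Used to compare means across three or more groups by comparing between-group variance to within-group variance.
- Power of a Test: The probability of correctly rejecting a false null hypothesis, equal to 1 − β, where β is the probability of a Type II error.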
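Power can be computed directly for a simple case. The sketch below handles a one-sided (upper-tailed) z-test with a known population standard deviation. The function name power_upper_tailed_z and the input values are hypothetical, and mu1 stands for an assumed true mean under the alternative hypothesis.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tailed_z(mu0, mu1, sigma, n, alpha=0.05):
    """Probability of rejecting H0: mean = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)   # rejection cutoff on the z scale
    shift = (mu1 - mu0) / (sigma / sqrt(n))    # true effect in standard-error units
    return 1 - std_normal.cdf(critical - shift)


# Hypothetical scenario: H0 mean 50, assumed true mean 52, sigma 5.
for n in (10, 25, 50, 100):
    print(n, round(power_upper_tailed_z(50, 52, 5, n), 3))
```

Under these assumptions power rises with sample size, and when the true mean equals the null value the rejection probability reduces to α.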