Statistical Analysis: Variables, Data, and Inference

Posted on Jan 11, 2025 in Statistics

Variables and Study Groups

Categorical variables
Quantitative variables
Explanatory variable
Response variable

Study Groups –> Population

–> Sample

Sampling and Data Collection

Sample:

Statistical Inference
Sampling Bias
Random Sample
Association vs. Causation
Confounding Variables

Collecting Data:

Experiment
Observational Study
Randomized Experiment
Control Group
Placebo
Blind Experiment
Double-Blind Experiment
Randomized Comparative Experiment
Matched Pairs

Describing Data

Standard Deviation
95% Rule
Z-Score
5-Number-Summary
Range
IQR
Boxplot
Side-by-side plots
Scatterplot
Direction of Association
Linear Correlation
Correlation
Regression Equation
Prediction
Residual
Causation
Simpson’s Paradox
Augmented Scatterplot
Visualization w/ 2 or more variables


3.1 Sampling Distributions

•Sample statistics vary from sample to sample

•Sampling Distribution shows how much the sample statistic varies from sample to sample

•If samples are randomly selected, the sampling distribution will be centered around the population parameter

•For most of the statistics we consider, if the sample size is large enough, the sampling distribution will be symmetrical and bell-shaped

•If you take random samples of the same size, from the same population, the sampling distribution will be centered around the true population parameter

•If sampling bias exists, your sampling distribution can give you bad information about the true parameter

•Variability of the statistic……

•Standard Error is the standard deviation of the sample statistic

•As the sample size increases, the variability of the sample statistics tends to decrease, and the sample statistics tend to be closer to the true value of the population parameter

•For larger sample sizes, you get less variability in the statistics, so less uncertainty in your estimates

•Statistical inference is drawing conclusions about a population based on a sample

•We use a sample statistic to estimate a population parameter

•To assess the uncertainty of a statistic, we need to know how much it varies from sample to sample

•To create a sampling distribution, take many samples of the same size from the population, and compute the statistic for each

•Standard error is the standard deviation of a statistic
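The points above can be checked with a small simulation. The sketch below is illustrative only: the population is made up (a right-skewed set of 10,000 values), and the helper name `sampling_distribution`, the sample sizes, and the number of samples are all assumptions chosen for the example. It takes many random samples of the same size, computes the mean of each, and reports the center and the standard deviation (the standard error) of those means.

```python
import random
import statistics


def sampling_distribution(population, sample_size, num_samples, rng):
    """Return the means of num_samples random samples of size sample_size."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population (an assumption for this example): 10,000
# right-skewed values with a mean of roughly 50.
population = [rng.expovariate(1 / 50) for _ in range(10_000)]
true_mean = statistics.fmean(population)

results = {}
for n in (10, 40, 160):
    means = sampling_distribution(population, n, 2_000, rng)
    center = statistics.fmean(means)         # center of the sampling distribution
    standard_error = statistics.stdev(means)  # SD of the sample statistics
    results[n] = (center, standard_error)
    print(f"n={n:4d}  center={center:6.2f}  SE={standard_error:5.2f}")

print(f"population mean = {true_mean:.2f}")
```

With random sampling, each center should land close to the population mean, and the standard error should shrink as the sample size grows.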

3.2 Confidence Intervals

•The larger the standard deviation of the sampling distribution (the standard error), the greater the spread in the distribution of sample statistics. That means there is high uncertainty surrounding any single statistic, and our margin of error will be large.

•A confidence interval for a parameter is an interval computed from sample data by a method that will capture the parameter for a specified proportion of all samples

•The success rate (proportion of all samples whose intervals contain the parameter) is known as the confidence level

•A 95% confidence interval will contain the true parameter for 95% of all samples

•If the sampling distribution is relatively symmetric and bell-shaped, a 95% confidence interval can be estimated using statistic ± 2 × SE, where 2 × SE is the margin of error (ME)

•Misinterpretation 1: “A 95% confidence interval contains 95% of the data in the population”

•Misinterpretation 2: “I am 95% sure that the mean of a sample will fall within a 95% confidence interval for the mean”

•Misinterpretation 3: “The probability that the population parameter is in this particular 95% confidence interval is 0.95”

•To create a plausible range of values for a parameter:

o Take many random samples of the same size from the population, and compute the sample statistic for each sample

o Compute the standard error as the standard deviation of all these statistics

o Use statistic ± 2 × SE
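The three steps above, together with the idea of the confidence level as a success rate, can be sketched in a short simulation. Everything concrete here is an assumption made for illustration: the population is simulated (so that the true parameter is known), the statistic is the sample mean, and the names `sample_mean`, `margin_of_error`, and `coverage` are hypothetical. Note that this mirrors the idealized procedure in the notes, which requires drawing repeatedly from the population itself.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population (an assumption): 10,000 roughly bell-shaped values.
population = [rng.gauss(100, 15) for _ in range(10_000)]
true_mean = statistics.fmean(population)
sample_size = 50


def sample_mean():
    """Mean of one random sample of size sample_size from the population."""
    return statistics.fmean(rng.sample(population, sample_size))


# Steps 1 and 2: take many samples of the same size, then compute the
# standard error as the standard deviation of the resulting statistics.
sample_means = [sample_mean() for _ in range(2_000)]
standard_error = statistics.stdev(sample_means)
margin_of_error = 2 * standard_error

# Step 3: statistic ± 2 × SE for one particular sample.
one_statistic = sample_mean()
interval = (one_statistic - margin_of_error, one_statistic + margin_of_error)
print(f"SE = {standard_error:.2f}, interval = ({interval[0]:.2f}, {interval[1]:.2f})")

# Confidence level as a success rate: build the interval for many new
# samples and count how often it captures the true parameter.
trials = 2_000
hits = sum(abs(sample_mean() - true_mean) <= margin_of_error for _ in range(trials))
coverage = hits / trials
print(f"true mean = {true_mean:.2f}, coverage = {coverage:.3f}")
```

The coverage should come out near 0.95. That figure describes the method across many samples, not the chance that any one particular interval contains the parameter, which is the point of Misinterpretation 3.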
