Statistical Inference: Sampling and Confidence Intervals

Chapter 9: Sampling Distributions


Quantile-Quantile Plot (QQ-Plot)

Empirical Rule: For data that follow a normal distribution, approximately 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean, respectively.
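The three percentages in the Empirical Rule can be checked directly from the normal distribution. The short Python sketch below is an illustration rather than part of the original notes; the helper name within_k_sd is hypothetical. It uses the identity P(|Z| < k) = erf(k / sqrt(2)) for a standard normal Z.

```python
import math

def within_k_sd(k: float) -> float:
    """Probability that a normal value lies within k standard deviations of its mean."""
    # For a standard normal Z, P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

# Coverage for 1, 2, and 3 standard deviations.
coverage = {k: within_k_sd(k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} SD: {p:.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

The rule's round figures (68%, 95%, 99.7%) are these values rounded, and they apply only when the data are approximately normal.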


Standard Normal Distribution

The Standard Normal distribution has a mean of 0 and a standard deviation of 1.


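Example: To find the percentage of babies that weigh less than 95 ounces at birth, first convert the value 95 to a standardized score (STAT), also called a z-score: subtract the mean from the value and divide the result by the standard deviation. The standardized score can then be looked up on the Standard Normal distribution.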
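A minimal Python sketch of this conversion follows. The mean and standard deviation of birth weights are not given in the text here, so the values mu = 110 and sigma = 15 ounces are hypothetical placeholders chosen only to make the arithmetic easy to follow. The sketch assumes birth weights are approximately normal.

```python
from statistics import NormalDist

# Hypothetical parameters (NOT from the source): assumed mean and
# standard deviation of birth weight, in ounces.
mu = 110.0
sigma = 15.0

x = 95.0                 # the weight of interest, in ounces
z = (x - mu) / sigma     # standardized score: distance from the mean in SD units

# Area to the left of z under the Standard Normal curve (mean 0, SD 1).
p_below = NormalDist().cdf(z)

print(f"z = {z:.2f}, P(weight < {x:.0f} oz) = {p_below:.4f}")
# z = -1.00, P(weight < 95 oz) = 0.1587
```

With these placeholder values, 95 ounces sits one standard deviation below the mean, so about 16% of babies would weigh less. Substituting the actual mean and standard deviation changes z and the resulting percentage.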


Based


Statistical Analysis and Predictive Modeling in Excel

Descriptive Statistics and Central Tendency

Descriptive statistics are the numbers that summarize a dataset, giving you a quick “snapshot” of its typical values and how much they vary. These are divided into Measures of Central Tendency (the middle) and Measures of Dispersion (the spread).

1. Measures of Central Tendency

These identify the “center” of your data where most values congregate.

  • Mean (Average): The sum of all values divided by the total count. It is the most common measure but is highly

Statistics Concepts: Variables, Distributions, and Inference

Lesson 1: Variables

  • Explanatory Variable – aka Independent Variable; explains variations in the response variable (x-axis). This is the predictor.
    • Example: “Can quiz scores be used to predict exam scores?” (Explanatory = Quiz scores)
  • Response Variable – aka Dependent Variable; its value is predicted or its variation is explained by the explanatory variable (y-axis). This is the outcome.

Lesson 2: Variable Types and Data Visualization

  • Categorical vs. Quantitative Variables
    • Categorical Variables = names,

Essential Causal Inference and Econometrics Techniques

Randomized Experiments and Causal Inference

Why are randomized experiments so desirable?
Randomization breaks the link between treatment assignment and confounders, making treated and untreated groups exchangeable. This guarantees unbiased estimates of causal effects (on average) because any differences in outcomes can be attributed to the treatment rather than selection.
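The effect of randomization can be illustrated with a small simulation. The Python sketch below is an assumption-laden toy example, not a method from the source: the data-generating model, the true effect of 2.0, and the function names diff_in_means, randomized, and self_selected are all hypothetical. A hidden confounder u raises the outcome, and in the self-selected case it also raises the chance of taking the treatment.

```python
import random

def diff_in_means(assign, n=20000, effect=2.0, seed=0):
    """Simulate n units and return mean(treated outcomes) - mean(control outcomes)."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                          # hidden confounder
        t = assign(u, rng)                           # treatment indicator, 0 or 1
        y = effect * t + 3.0 * u + rng.gauss(0, 1)   # outcome depends on t and u
        (treated if t else control).append(y)
    return sum(treated) / len(treated) - sum(control) / len(control)

def randomized(u, rng):
    # A coin flip: assignment ignores the confounder entirely.
    return 1 if rng.random() < 0.5 else 0

def self_selected(u, rng):
    # Units with a larger confounder are more likely to be treated.
    return 1 if u + rng.gauss(0, 1) > 0 else 0

rand_est = diff_in_means(randomized)
conf_est = diff_in_means(self_selected)

print(f"randomized estimate:    {rand_est:.2f}")
print(f"self-selected estimate: {conf_est:.2f}")
```

Under these assumptions the randomized estimate lands close to the true effect of 2.0, while the self-selected estimate is pushed well above it because treated units also tend to have a larger confounder.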

Why might we not be able to run a randomized experiment?
They may be unethical (e.g., denying beneficial treatments), infeasible


Essential Data Science Concepts and Statistical Methods

Data Science Fundamentals

Data Science combines statistics, computer science, and domain knowledge to extract insights from data. The main goal is to uncover hidden patterns, trends, and other valuable information from large datasets to make informed, data-driven decisions. It deals with both structured (e.g., Excel tables) and unstructured (e.g., text, images) data.

The Data Science Lifecycle

  • Problem Definition: Understanding the business question.
  • Data Collection: Gathering data from various sources.

Statistical Sampling Distributions and Inference Exercises

Review Exercises for Sampling Distributions

8.56 Consider the data displayed in Exercise 1.20 on page 31. Construct a box-and-whisker plot and comment on the nature of the sample. Compute the sample mean and sample standard deviation.

8.57 If X1, X2, …, Xn are independent random variables having identical exponential distributions with parameter θ, show that the density function of the random variable Y = X1 + X2 + … + Xn is that of a gamma distribution with parameters α = n and β = θ.
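The claim in 8.57 can be sanity-checked numerically before attempting the proof. The Python sketch below is only a simulation check, not the derivation the exercise asks for. It assumes θ is the mean (scale) of each exponential, which matches β = θ being the gamma scale parameter. The values n = 5 and theta = 2.0 are arbitrary illustrative choices. A gamma distribution with α = n and β = θ has mean nθ and variance nθ², so the simulated sums should have mean near 10 and variance near 20.

```python
import random
from statistics import fmean, pvariance

n, theta = 5, 2.0          # illustrative values, not from the exercise
rng = random.Random(1)     # fixed seed so the run is repeatable

# random.expovariate takes the rate (1 / mean), so pass 1 / theta.
sums = [
    sum(rng.expovariate(1 / theta) for _ in range(n))
    for _ in range(20000)
]

sample_mean = fmean(sums)
sample_var = pvariance(sums)

print(f"sample mean {sample_mean:.2f} vs gamma mean {n * theta:.2f}")
print(f"sample variance {sample_var:.2f} vs gamma variance {n * theta**2:.2f}")
```

Agreement of the first two moments is consistent with the gamma result but does not establish it; the proof itself usually goes through moment-generating functions.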

8.58
