Probability and Statistics: Key Concepts
Sample Space (Ω)
The collection of all possible outcomes of an experiment.
Empty Set (∅)
The set containing no outcomes. E = ∅ indicates that the event E is impossible, so P(E) = 0.
Empirical Probability (EP)
- What is observed.
- Data collection.
- Estimation of some unknown truth.
Theoretical Probability (TP)
What is expected – calculated using mathematical reasoning or computation.
Law of Large Numbers
EP approaches TP as the number of trials n approaches infinity.
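As a rough illustration of the law of large numbers, the sketch below simulates fair coin flips and compares the empirical probability of heads with the theoretical value 0.5. The function name `empirical_probability`, the fixed seed, and the trial counts are assumptions chosen for this example.

```python
import random

THEORETICAL = 0.5  # theoretical probability of heads for a fair coin

def empirical_probability(n_trials, seed=0):
    """Estimate P(heads) from n_trials simulated fair coin flips."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < THEORETICAL for _ in range(n_trials))
    return heads / n_trials

for n in (10, 1_000, 100_000):
    ep = empirical_probability(n)
    # Print n, the empirical probability, and its gap from the theoretical value.
    print(n, ep, abs(ep - THEORETICAL))
```

The gap tends to shrink as n grows, although any single run can fluctuate.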
Union of Joint Events (A ∪ B)
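A ∪ B is the set of outcomes in A, in B, or in both (“A or B”). On a Venn diagram, everything inside both circles is shaded.
Joint: the events intersect or overlap, so they share at least one outcome.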
Union of Disjoint Events
A or B, but not both, can occur on a single measurement.
Disjoint: the events DO NOT intersect or overlap, so A ∩ B = ∅.
Probability of a Union
For A and B in general: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
For mutually exclusive events: P(A ∪ B) = P(A) + P(B)
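A small check of the two union rules, P(A ∪ B) = P(A) + P(B) − P(A ∩ B) in general and P(A ∪ B) = P(A) + P(B) for mutually exclusive events. The fair-die sample space, the events, and the helper `prob` are assumptions made for illustration.

```python
from fractions import Fraction

OMEGA = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed example)

def prob(event):
    """P(event) when every outcome in OMEGA is equally likely."""
    return Fraction(len(event & OMEGA), len(OMEGA))

A = {1, 2, 3}
B = {2, 3, 4}
# General rule: subtract the overlap so it is not counted twice.
p_union = prob(A) + prob(B) - prob(A & B)

C = {5, 6}  # disjoint from A
# Mutually exclusive rule: no overlap, so the probabilities simply add.
p_union_disjoint = prob(A) + prob(C)

print(p_union, prob(A | B))
print(p_union_disjoint, prob(A | C))
```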
Independence of Events
Events are independent if the occurrence of one event does not change the probability of the other event.
Intersection (A ∩ B)
A ∩ B is the set of outcomes that are in both A and B.
Example: A = {1, 2, 3}, B = {2, 3, 4}, A ∩ B = {2, 3}.
Difference of Two Events (A\B)
A \ B is the set of outcomes in A that are not in B, also written A − B. Note that A \ B ≠ B \ A in general, and A \ B = A − (A ∩ B).
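The intersection and difference can be tried directly with Python's built-in sets, reusing A = {1, 2, 3} and B = {2, 3, 4} from the example above. The variable names are illustrative only.

```python
A = {1, 2, 3}
B = {2, 3, 4}

intersection = A & B  # outcomes in both A and B
a_minus_b = A - B     # outcomes in A that are not in B
b_minus_a = B - A     # outcomes in B that are not in A

print(intersection, a_minus_b, b_minus_a)
print(a_minus_b == A - (A & B))  # A \ B equals A minus the overlap
```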
Axioms of Probability
- Probability is non-negative: P(E) ≥ 0 for every event E.
- The probabilities of all outcomes sum to 1 (100%): P(Ω) = 1.
- For any disjoint events A and B: P(A ∪ B) = P(A) + P(B).
Complement (Eᶜ)
Eᶜ, also written as E with a horizontal bar, is the event that occurs when E does not occur, so P(Eᶜ) = 1 − P(E).
Sequential System
Failure of one component causes failure of the whole system. ALL components MUST WORK.
Parallel System
If at least one component works, the whole system works.
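A minimal sketch of how the two system types translate into probabilities, under the added assumption that components work or fail independently (the notes do not state this). The function names and the component probabilities are hypothetical.

```python
from math import prod

def series_reliability(p_work):
    """P(system works) when ALL components must work."""
    return prod(p_work)

def parallel_reliability(p_work):
    """P(system works) when at least one component must work."""
    # The system fails only if every component fails.
    return 1 - prod(1 - p for p in p_work)

components = [0.9, 0.9]  # hypothetical per-component probabilities of working

print(series_reliability(components))
print(parallel_reliability(components))
```

With the same components, the parallel arrangement is more reliable than the sequential one because a single working component is enough.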
Simpson’s Paradox
- Ignoring the subgroups shows a line with a downward trend through the entire group.
- Accounting for the subgroups shows a positive trend within each group.
- The trend reverses!
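The reversal can be reproduced with a tiny made-up data set: each subgroup has an upward least-squares slope, while the pooled data slope downward. The data values and the helper `slope` are assumptions for illustration, not observations from any study.

```python
def slope(xs, ys):
    """Least-squares slope of the line fitted to the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Made-up data: two subgroups, each with an upward trend.
group1_x, group1_y = [1, 2, 3], [6, 7, 8]
group2_x, group2_y = [6, 7, 8], [1, 2, 3]

slope_group1 = slope(group1_x, group1_y)
slope_group2 = slope(group2_x, group2_y)
# Ignoring the subgroups: fit one line through all six points.
slope_pooled = slope(group1_x + group2_x, group1_y + group2_y)

print(slope_group1, slope_group2, slope_pooled)
```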
Proving Independence
To prove independence, show that P(A) × P(B) = P(A∩B). If the equality does not hold, the events are not independent.
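The product test can be checked exactly with fractions. The fair-die sample space, the three events, and the helper names below are assumptions chosen for this sketch.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # one roll of a fair die (assumed example)

def prob(event):
    """P(event) when every outcome in OMEGA is equally likely."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def are_independent(a, b):
    """True when P(A) * P(B) equals P(A ∩ B)."""
    return prob(a) * prob(b) == prob(a & b)

EVEN = {2, 4, 6}
AT_MOST_4 = {1, 2, 3, 4}
AT_MOST_3 = {1, 2, 3}

print(are_independent(EVEN, AT_MOST_4))  # 1/2 * 2/3 versus P({2, 4})
print(are_independent(EVEN, AT_MOST_3))  # 1/2 * 1/2 versus P({2})
```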
Joint Probability
P(A∩B) = P(A|B) × P(B) OR P(A∩B) = P(B|A) × P(A)
Conditional Probability (A given B)
P(A|B) = P(A∩B) / P(B) OR P(B|A) = P(A∩B) / P(A)
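A short worked example of the conditional and joint probability formulas, again assuming one roll of a fair die. The events A and B and the helper `prob` are illustrative choices.

```python
from fractions import Fraction

OMEGA = set(range(1, 7))  # one roll of a fair die (assumed example)
A = {2, 4, 6}             # the roll is even
B = {1, 2, 3}             # the roll is at most 3

def prob(event):
    """P(event) when every outcome in OMEGA is equally likely."""
    return Fraction(len(event), len(OMEGA))

p_a_given_b = prob(A & B) / prob(B)  # P(A|B) = P(A∩B) / P(B)
p_joint = p_a_given_b * prob(B)      # P(A∩B) = P(A|B) × P(B)

print(p_a_given_b, p_joint)
```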
Factorial (!)
n! is the product of all positive integers from n down to 1. Example: 4! = 4 × 3 × 2 × 1 = 24
Double Factorial (!!)
n!! is the product of every second integer from n down to 2 (n even) or 1 (n odd). Example: 8!! = 8 × 6 × 4 × 2 = 384
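Both quantities are easy to compute. `math.factorial` is part of the Python standard library; `double_factorial` is a hypothetical helper written for this sketch.

```python
from math import factorial

def double_factorial(n):
    """Product n * (n - 2) * (n - 4) * ... down to 2 or 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

print(factorial(4))         # 4 * 3 * 2 * 1
print(double_factorial(8))  # 8 * 6 * 4 * 2
print(double_factorial(7))  # 7 * 5 * 3 * 1
```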
Bayes’ Rule
P(A|B) = P(B|A) × P(A) / P(B)
(Use to reverse a conditional probability: finding P(A|B) when P(B|A) is what you know.)
P(A ∩ B) is a joint probability. P(A|B) and P(B|A) are conditional probabilities. P(A) and P(B) are marginal probabilities.
Law of Total Probability
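If events A1, ..., An are mutually exclusive and partition Ω, then for any event B:
P(B) = P(B|A1) × P(A1) + ... + P(B|An) × P(An)
Let B be an event with non-zero probability, then the statement of Bayes’ Theorem is:
P(A|B) = P(B|A) × P(A) / P(B)
Law of Total Probability with Bayes’ Theorem
For events A1, ..., An that partition Ω, replace P(B) in the denominator with its expansion:
P(Ai|B) = P(B|Ai) × P(Ai) / [P(B|A1) × P(A1) + ... + P(B|An) × P(An)]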
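A worked sketch of the law of total probability feeding Bayes' rule, using a screening test with a two-event partition (disease, no disease). All numbers and variable names are hypothetical and chosen only to make the arithmetic visible.

```python
from fractions import Fraction

# Hypothetical numbers for a screening test (not taken from the notes).
p_disease = Fraction(1, 100)             # P(D)
p_pos_given_disease = Fraction(95, 100)  # P(+|D)
p_pos_given_healthy = Fraction(5, 100)   # P(+|not D)

p_healthy = 1 - p_disease
# Law of total probability over the partition {D, not D}.
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * p_healthy
# Bayes' rule: reverse the conditioning to get P(D|+).
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(p_pos, p_disease_given_pos, float(p_disease_given_pos))
```

Even with a fairly accurate test, the posterior stays small here because the assumed prior P(D) is small.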
Permutation
Selecting k items from a set of n items where order *does* matter.
Without replacement (repeating is not allowed): P(n, k) = n! / (n − k)!
Permutation with replacement (repeating is allowed): n^k
Combination
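Selecting k items from a set of n items where order *does not* matter.
Without replacement (repeating is not allowed): C(n, k) = n! / (k! × (n − k)!)
With replacement (repeating is allowed): C(n + k − 1, k) = (n + k − 1)! / (k! × (n − 1)!)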
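The four counting formulas (n!/(n − k)!, n^k, n!/(k!(n − k)!), and (n + k − 1)!/(k!(n − 1)!)) can be cross-checked against brute-force enumeration. `math.perm` and `math.comb` require Python 3.8 or later; the values n = 5 and k = 3 are arbitrary choices for this sketch.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, perm

n, k = 5, 3
items = range(n)

perm_without = perm(n, k)       # order matters, no repeats
perm_with = n ** k              # order matters, repeats allowed
comb_without = comb(n, k)       # order ignored, no repeats
comb_with = comb(n + k - 1, k)  # order ignored, repeats allowed

# Cross-check each formula by listing every selection.
assert perm_without == len(list(permutations(items, k)))
assert perm_with == len(list(product(items, repeat=k)))
assert comb_without == len(list(combinations(items, k)))
assert comb_with == len(list(combinations_with_replacement(items, k)))

print(perm_without, perm_with, comb_without, comb_with)
```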
Fundamental Theorem of Counting (FTC)
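Used to calculate the number of different ways that a sequence of “tasks” can be performed: if the first task can be done in n1 ways, the second in n2 ways, and so on up to the k-th task, the whole sequence can be done in n1 × n2 × ... × nk ways.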
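As an illustration of the counting rule, the sketch below treats choosing an outfit as three "tasks" and checks that the number of outfits equals the product of the number of options per task. The wardrobe items are made up for the example.

```python
from itertools import product
from math import prod

# Hypothetical tasks: pick one option from each list.
shirts = ["red", "blue", "green"]
pants = ["jeans", "khakis"]
shoes = ["sneakers", "boots"]

outfits = list(product(shirts, pants, shoes))  # every possible combination
count_by_rule = prod(len(options) for options in (shirts, pants, shoes))

print(len(outfits), count_by_rule)
```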
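Binomial Distribution
Describes the number of successes k in n independent trials, each with success probability p: P(X = k) = C(n, k) × p^k × (1 − p)^(n − k), where C(n, k) = n! / (k! × (n − k)!).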
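A minimal sketch of the binomial probabilities, assuming the standard setting of n independent trials that each succeed with probability p, so that P(X = k) = C(n, k) × p^k × (1 − p)^(n − k). The function name `binomial_pmf` and the choice of n = 4, p = 1/2 are illustrative.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n = 4
p = Fraction(1, 2)  # e.g. heads on a fair coin
pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

print(pmf)
print(sum(pmf.values()))  # the probabilities add up to 1
```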
Random Variable
A random variable is a function that assigns a number to each outcome in the sample space.
Range of a Random Variable
The set of possible values of a random variable (RV) is called the range of the RV.
Discrete Random Variable
A random variable whose range is a countable set. Any finite set, the natural numbers, the integers, and the rational numbers are each a countable set. (For example: The number of days it will rain next week is a random variable. This random variable could be any whole number from 0 to 7.)
Probability Distribution
Describes the values a random variable could be, and the probabilities of each value.
Probability Mass Function (PMF)
The function that maps the possible outcomes to the probability of each outcome.
Cumulative Distribution Function (CDF)
Add the probability of an outcome and the probabilities of all smaller outcomes to find the cumulative probability of the outcome.
F(x) = P(X ≤ x) = the sum of P(X = t) for all t ≤ x
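The PMF and CDF can be built by hand for a small discrete random variable. The sketch below uses the number of heads in three fair coin flips, an example chosen for illustration rather than taken from the notes.

```python
from fractions import Fraction
from itertools import accumulate, product

# Enumerate all 2^3 equally likely outcomes of three fair coin flips.
outcomes = list(product("HT", repeat=3))

# PMF: probability of each possible number of heads.
pmf = {k: Fraction(sum(o.count("H") == k for o in outcomes), len(outcomes))
       for k in range(4)}

# CDF: running total of the PMF over the values in increasing order.
cdf = dict(zip(pmf, accumulate(pmf.values())))

print(pmf)
print(cdf)
```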
Mean (Expected Value)
Average = E(X) = sum(each outcome × its probability)
Poisson Family/Distribution
E(X) = Var(X) = λ
λ (lambda) – the single parameter that controls the Poisson distribution: the average number of events per interval.
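A numerical check that the mean and variance both equal λ, assuming the standard Poisson probabilities P(X = k) = λ^k × e^(−λ) / k!. The value λ = 3 and the truncation of the sum at k = 59 are choices made for this sketch.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with parameter lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 3.0
ks = range(60)  # truncated sum; the tail beyond this is negligible for lam = 3

mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(mean, var)  # both should be very close to lam
```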
Standard Deviation
Measures how far off the actual outcomes are likely to be from the average outcome. It is the square root of the variance.
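To make the definition concrete, the sketch below computes the mean, variance, and standard deviation of one roll of a fair die from its probabilities. The die example and variable names are assumptions for illustration.

```python
from fractions import Fraction
from math import sqrt

# PMF of one roll of a fair die (assumed example).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Mean: each outcome weighted by its probability.
mean = sum(x * p for x, p in pmf.items())
# Variance: probability-weighted squared distance from the mean.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(mean, variance, std_dev)
```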
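Geometric Distribution
Describes the number of independent trials needed to get the first success, each trial having success probability p: P(X = k) = (1 − p)^(k − 1) × p for k = 1, 2, 3, ...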