Essential R Programming Syntax and Data Analysis Techniques
R Programming Fundamentals
- n <- x: Store the value x in the variable n.
- c(): Combine values into a vector.
- as.type(x): Convert x to another data type (e.g., as.numeric(x), as.character(x)).
- rm(variable): Remove a variable from memory; rm(list = ls()) removes every object in the current workspace.
- ls(): List all variables in memory.
- ==: Equality operator.
- NaN: Not a Number (undefined mathematical operation).
- NA: Missing value.
- Vector indexing: Use [x] for a single value or [x:y] for a range.
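The commands listed above can be tried together in a short session. This is a minimal sketch; the variable names and values are made up for illustration.

```r
# Assign a value to a variable
n <- 5

# Combine values into a (character) vector
v <- c("10", "20", "30")

# Convert the character vector to numeric
v_num <- as.numeric(v)

# Equality comparison returns a logical value
is_five <- n == 5            # TRUE

# NaN comes from an undefined operation; NA marks a missing value
undefined <- 0 / 0
missing   <- NA
is.nan(undefined)            # TRUE
is.na(missing)               # TRUE

# Vector indexing starts at 1: a single element, then a range
first_value <- v_num[1]      # 10
first_two   <- v_num[1:2]    # 10 20

# List the variables in memory, then remove one of them
ls()
rm(n)
```

Note that indexing in R is 1-based, so `v_num[1]` is the first element rather than the second.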
Conditional Logic and Loops
Marketing Campaign Classification
conversion <
Statistical Hypothesis Testing and JMP Analysis
1. Background, Problem Statement & Goals
This section aligns with the 4M framework: Motivation → Method → Mechanics → Message.
A. Define the Business Problem
Clearly explain the question you are trying to answer. Examples:
- How can we estimate home prices?
- Which marketing channel generates the highest sales?
- Does employee experience affect salary?
B. Identify the Response Variable (Dependent Variable)
The response variable is what you are predicting or explaining (e.g., Sales Revenue, Home Price,
Pearson Correlation and Linear Regression Formulas
Pearson Correlation & Linear Regression Cheat Sheet
Pearson Product-Moment Correlation & Linear Regression
Correlation Definition
Correlation measures the strength and direction of a relationship between two variables.
Warning: Correlation does NOT imply causation.
Types of Correlation
| Type | Description | Graph Trend |
|---|---|---|
| Positive Correlation | Variables increase/decrease together | Upward slope |
| Negative Correlation | One increases while the other decreases | Downward slope |
| Zero Correlation | No predictable relationship | Random |
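To make the strength and direction concrete, here is a minimal R sketch that computes the Pearson coefficient from deviations about the means and compares it with the built-in `cor()`. The five data points are hypothetical, and the least-squares slope and intercept are included only as an illustration of the regression line that usually accompanies the coefficient.

```r
# Hypothetical paired observations (illustrative only)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Deviations of each value from its mean
dx <- x - mean(x)
dy <- y - mean(y)

# Pearson r: summed co-deviation divided by the product of the spreads
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Built-in equivalent (Pearson is the default method of cor())
r_builtin <- cor(x, y)

# Least-squares line y = a + b * x
b <- sum(dx * dy) / sum(dx^2)   # slope
a <- mean(y) - b * mean(x)      # intercept

print(c(r = r_manual, slope = b, intercept = a))
```

For this sample r is about 0.77, a positive correlation: the points trend upward, which matches the positive slope of 0.6.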
Mathematical Methods in PDEs and Statistical Analysis
Partial Differential Equations (PDE)
A Partial Differential Equation (PDE) involves a function u(x, y, …) and its partial derivatives.
- Homogeneous: If every term in the equation contains the dependent variable u or its derivatives. The general solution is simply the Complementary Function (C.F.).
- Non-Homogeneous: If there is a term that is a function of the independent variables only (f(x, y)). The solution is u = C.F. + P.I. (Particular Integral). Example: ∇²u = f(x, y) (Poisson’s Equation).
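As a sketch of the distinction, the two cases can be written side by side for the two-variable Laplace operator. Pairing Poisson's equation with Laplace's equation as its homogeneous counterpart is our choice of illustration, not something stated in the notes.

```latex
\begin{align*}
\text{Homogeneous (Laplace):}\quad
  & \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0,
  & u &= \text{C.F.} \\
\text{Non-homogeneous (Poisson):}\quad
  & \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x, y),
  & u &= \text{C.F.} + \text{P.I.}
\end{align*}
```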
Fundamentals of Statistics: Concepts and Data Analysis
1. Statistics: Descriptive vs. Inferential
Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to draw meaningful conclusions and support decision-making.
Comparison of Statistical Methods
| Basis | Descriptive Statistics | Inferential Statistics |
|---|---|---|
| Meaning | Summarizes and describes data | Draws conclusions about population |
| Purpose | To present data clearly | To make predictions/decisions |
| Data used | Uses complete data set | Uses sample data |
| Techniques | Mean, median, mode, graphs | Probability, |
Maximum unambiguous range maximum theoretical range
Interquartile range = the difference between the third and first quartiles (Q3 − Q1).
Cumulative frequency = the running total of the frequencies of all values up to and including a given value.
Variance = the average of the squared differences from the Mean.
Standard deviation = the square root of the Variance; taking the root returns the measure to the original units of the data instead of squared units.
The Range (class interval) = the interval between two values whose frequencies are combined into one group to simplify longer datasets.
Quartiles = A division of
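The definitions above can be checked numerically. This is a minimal R sketch, assuming a small made-up sample; the data and variable names are illustrative only. The variance defined above divides by n, while R's `var()` and `sd()` divide by n − 1, so both forms are shown.

```r
# Hypothetical sample of eight observations
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Variance as defined above: the mean of the squared differences from the mean
# (population form, dividing by n)
pop_var <- mean((x - mean(x))^2)   # 4
pop_sd  <- sqrt(pop_var)           # 2

# R's var() and sd() divide by n - 1 (sample form), so they come out larger
samp_var <- var(x)
samp_sd  <- sd(x)

# Interquartile range: third quartile minus first quartile
q   <- quantile(x, c(0.25, 0.75))
iqr <- IQR(x)                      # 1.5 with R's default quantile method

# Cumulative frequency: running total of the frequency table
freq     <- table(x)
cum_freq <- cumsum(freq)           # 1 4 6 7 8

print(cum_freq)
```

Quartile values depend on the interpolation rule; `quantile()` has several `type` options, so other tools or textbooks may report a slightly different interquartile range for the same data.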