Essential R Programming Syntax and Data Analysis Techniques

R Programming Fundamentals

  • n <- x: Assign the value x to the variable n.
  • c(): Combine values into a vector.
  • as.type(x): Convert x to another data type (e.g., as.numeric(), as.character()).
  • rm(variable): Remove a variable from memory; rm(list = ls()) removes every object in the workspace.
  • ls(): List all variables in memory.
  • ==: Equality operator.
  • NaN: Not a Number (undefined mathematical operation).
  • NA: Missing value.
  • Vector indexing: Use v[x] for a single element or v[x:y] for a range of elements (indices start at 1).
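
The commands above can be tried together in a short session. This is a minimal sketch: the variable names (n, scores, scores_num, undefined, missing) and the values are made up for illustration and are not part of the original notes.

```r
# Assign a value to a variable
n <- 5

# Combine values into a vector (illustrative data stored as text)
scores <- c("10", "20", "30")

# Convert the character strings to numbers
scores_num <- as.numeric(scores)

# The equality operator compares element by element and returns TRUE/FALSE
is_twenty <- scores_num == 20

# Indexing starts at 1
first_score <- scores_num[1]     # single element
last_two <- scores_num[2:3]      # range of elements

# NaN results from an undefined operation; NA marks a missing value
undefined <- 0 / 0
missing <- c(1, NA, 3)
which_missing <- is.na(missing)

# List the variables in memory, then remove one of them
ls()
rm(n)
```

Calling rm(list = ls()) at the end would clear every object created in this session, so it is best kept for a deliberate reset.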

Conditional Logic and Loops

Marketing Campaign Classification

conversion <

Statistical Hypothesis Testing and JMP Analysis

1. Background, Problem Statement & Goals

This section aligns with the 4M framework: Motivation → Method → Mechanics → Message.

A. Define the Business Problem

Clearly explain the question you are trying to answer. Examples:

  • How can we estimate home prices?
  • Which marketing channel generates the highest sales?
  • Does employee experience affect salary?

B. Identify the Response Variable (Dependent Variable)

The response variable is what you are predicting or explaining (e.g., Sales Revenue, Home Price,


Pearson Correlation and Linear Regression Formulas

Pearson Correlation & Linear Regression Cheat Sheet


Pearson Product-Moment Correlation & Linear Regression

Correlation Definition

Correlation measures the strength and direction of a relationship between two variables.

Warning: Correlation does NOT imply causation.


Types of Correlation

Type                  | Description                              | Graph Trend
Positive Correlation  | Variables increase/decrease together     | Upward slope
Negative Correlation  | One increases while the other decreases  | Downward slope
Zero Correlation      | No predictable relationship              | Random
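
To make the idea of strength and direction concrete, here is a minimal R sketch. The data vectors x and y are invented for illustration. The built-in cor() call is compared against the textbook definition of the Pearson coefficient, and lm() fits the least-squares line.

```r
# Illustrative data, not from the original notes
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Built-in Pearson correlation coefficient
r <- cor(x, y, method = "pearson")

# Same value from the definition: sum of cross-deviations divided by the
# square root of the product of the sums of squared deviations
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Least-squares regression line y = a + b * x
fit <- lm(y ~ x)
coef(fit)
```

A positive r goes with an upward-sloping fitted line and a negative r with a downward-sloping one; neither says anything about causation.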

Mathematical Methods in PDEs and Statistical Analysis

Partial Differential Equations (PDE)

A Partial Differential Equation (PDE) involves a function u(x, y, …) and its partial derivatives.

  • Homogeneous: If every term in the equation contains the dependent variable u or its derivatives. The general solution is simply the Complementary Function (C.F.).
  • Non-Homogeneous: If there is a term that is a function of the independent variables only (f(x, y)). The solution is u = C.F. + P.I. (Particular Integral). Example: ∇²u = f(x, y) (Poisson’s Equation).
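
As a small worked illustration of u = C.F. + P.I., take Poisson's equation in two variables with the right-hand side f(x, y) = x + y. This choice of f is an assumption made for the example, not something taken from the notes. One particular integral can be checked by direct differentiation:

```latex
\nabla^2 u
  = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
  = x + y
\quad\Longrightarrow\quad
u_{\text{P.I.}} = \frac{x^3}{6} + \frac{y^3}{6},
\qquad
u = u_{\text{C.F.}} + u_{\text{P.I.}},
\quad
\nabla^2 u_{\text{C.F.}} = 0 .
```

Differentiating the particular integral twice in x gives x and twice in y gives y, so it reproduces f(x, y); the complementary function is any solution of the homogeneous (Laplace) equation.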

Fundamentals of Statistics: Concepts and Data Analysis

1. Statistics: Descriptive vs. Inferential

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to draw meaningful conclusions and support decision-making.

Comparison of Statistical Methods

Basis       | Descriptive Statistics         | Inferential Statistics
Meaning     | Summarizes and describes data  | Draws conclusions about population
Purpose     | To present data clearly        | To make predictions/decisions
Data used   | Uses complete data set         | Uses sample data
Techniques  | Mean, median, mode, graphs     | Probability,

Maximum unambiguous range maximum theoretical range

Interquartile range = range between the first and third quartile.

Cumulative frequency = the running total of frequencies, i.e. the sum of the frequencies of all values up to and including a given value.

Variance = the average of the squared differences from the Mean.

Standard deviation = the square root of the Variance, i.e. the square root of the average of the squared differences from the Mean. Taking the square root returns the measure to the original units of the data.

The Range (class interval) = the span between two values whose frequencies are combined into a single group to simplify longer datasets.
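
The definitions above can be checked on a small dataset. This is a minimal R sketch with invented data. Note one assumption spelled out in the comments: the variance defined above divides by n (population form), whereas R's built-in var() and sd() divide by n - 1 (sample form). IQR() uses R's default quantile method.

```r
# Illustrative data, not from the original notes
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Variance as defined above: the average of the squared differences from the mean
pop_var <- mean((x - mean(x))^2)

# Standard deviation: square root of the variance, back in the original units
pop_sd <- sqrt(pop_var)

# R's var() divides by n - 1 instead of n, so it is slightly larger
sample_var <- var(x)

# Interquartile range: third quartile minus first quartile
iqr <- IQR(x)

# Cumulative frequency: running total of the frequency of each distinct value
cum_freq <- cumsum(table(x))
```

For this dataset the mean is 5, so the population variance is 32 / 8 = 4 and the standard deviation is 2, while var(x) gives 32 / 7.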

Quartiles = A division of
