Biostatistics in Medicine and Public Health: Key Concepts

Biostatistics is the science that helps us make sense of data. It can be as simple as summarizing a distribution or identifying whether two variables are correlated. It is crucial for biomedical science because it lets us interpret data collected in an experiment or survey and assess whether an observed result could have arisen by chance alone.

  • It tells us how to collect, organize, analyze, and interpret numerical data.

Main Method

Probability theory provides the basic tools for the study and application of statistical methods.

Main Content

  • Analysis: Judging the influence of various factors.
  • Decision: Determining which method is the most favorable.
  • Prediction: Assessing the probability that a treatment will succeed.

Statistics Divisions

  • Descriptive Statistics: This is the older branch. It organizes and summarizes numerical information.
  • Inductive (Inferential) Statistics: This is the newer branch. It draws conclusions about a population from a sample, for example through estimation and hypothesis testing.
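
The difference between the two branches can be sketched in a few lines of Python. The blood pressure readings below are invented for illustration and do not come from any study. The first part only summarizes the sample (descriptive); the second part uses the sample to make a statement about the population (inductive), here an approximate 95% confidence interval based on the normal approximation with z = 1.96. For a sample this small a t-based interval would usually be preferred; the normal approximation is an assumption made to keep the sketch within the standard library.

```python
import math
import statistics

# Hypothetical systolic blood pressure readings (mmHg), invented for illustration.
readings = [118, 125, 130, 122, 135, 128, 140, 119, 127, 133]

# Descriptive statistics: organize and summarize the sample itself.
n = len(readings)
mean = statistics.mean(readings)
sd = statistics.stdev(readings)  # sample standard deviation (n - 1 denominator)

# Inductive statistics: use the sample to say something about the population.
standard_error = sd / math.sqrt(n)
z = 1.96  # normal approximation for roughly 95% coverage (assumption)
ci_low = mean - z * standard_error
ci_high = mean + z * standard_error

print(f"n = {n}, mean = {mean:.1f}, SD = {sd:.1f}")
print(f"approximate 95% CI for the population mean: {ci_low:.1f} to {ci_high:.1f}")
```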

Fundamentals of Statistics

Research is almost always carried out with samples from a population with the aim of drawing some conclusions about the whole population. Understanding the relationship between the sample and the population is a key function of statistics and helps us to understand the results.

It is important to determine whether the sample provides a good estimate of the true incidence or occurrence in the population as a whole. This requires answering two questions:

  • How big is the sample?
  • Was it representative of the population, or were there possible biases?
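
Why sample size matters can be illustrated with the standard error of a proportion, sqrt(p(1 - p) / n). The sketch below assumes simple random sampling and a hypothetical observed incidence of 20%; the function name and the sample sizes are chosen only for illustration. It addresses the first question only: a larger sample narrows the margin of error but does nothing to correct a biased sample.

```python
import math

def proportion_standard_error(p, n):
    """Standard error of a sample proportion p estimated from n units."""
    return math.sqrt(p * (1 - p) / n)

observed_proportion = 0.20  # hypothetical observed incidence (assumption)

for n in (25, 100, 400, 1600):
    se = proportion_standard_error(observed_proportion, n)
    margin = 1.96 * se  # approximate 95% margin of error (normal approximation)
    print(f"n = {n:5d}: standard error = {se:.4f}, approx. 95% margin = ±{margin:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.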

Population

Observations are made on a random sample drawn from a larger population.

  • Statistical Problem: Conclusions about the population must be based on the information gathered from the sample.
  • Entire Population (N): All measurements, all patients, all observations of interest according to the definition of the statistical unit – the sum of all units.

Sample

  • Random Sample (n): A part of the population, selected completely by chance, that is intended to be representative of the whole.

It is used to estimate the properties of the entire population.

  • Sampling: The process of selecting a statistically determined number of subjects from the population (the universe) so that the sample provides an accurate estimate of the quantity being studied.
  • Each unit should be defined objectively, in place, and in time.

N: size of the entire population = number of all units.
n: size of the sample.
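
A minimal sketch of simple random sampling, using Python's standard `random` module. The population here is just a list of hypothetical patient IDs, and the values of N and n are arbitrary; the fixed seed is there only to make the draw reproducible.

```python
import random

N = 1000  # size of the entire (hypothetical) population
n = 50    # size of the sample

# Each unit is identified by a patient ID from 1 to N.
population = list(range(1, N + 1))

rng = random.Random(42)  # fixed seed only so the draw is reproducible
sample = rng.sample(population, n)  # simple random sampling without replacement

print(f"N = {len(population)}, n = {len(sample)}")
print("first five sampled IDs:", sorted(sample)[:5])
```

Every unit has the same chance of being selected, and no unit can be drawn twice.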

Variables

  • A variable is an element, feature, or factor (attribute) that can vary or change.
  • Variables may be qualitative (taking distinct values or classes) or quantitative (taking numeric values).

Types of Qualitative Variables

  • Categorical: Classes with no natural order (e.g., whether someone travels to work by car, bus, train, foot, bicycle, or motorcycle); these are called nominal data.
  • Ordered Categorical: Classes with a natural order (e.g., whether someone is a non-smoker or a light, moderate, or heavy smoker, or has low, medium, or high blood pressure); these are called ordinal data.
  • Binary: A special case of variables that take on one of two possible values (e.g., true/false, male/female, survived/died).

Quantitative Variables

These take on numeric values.

  • Random Variable: A variable whose numerical value is determined by a chance event. Random variables can be discrete or continuous.
  • Discrete: Takes on integer values, usually the result of counting (e.g., number of patients in a study, number of cases of disease, number of beds).
  • Continuous: Can take on any value within a range (e.g., temperature, blood pressure, height).
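
The variable types above call for different summaries. The sketch below uses a handful of invented patient records (the field names and values are hypothetical) and picks one reasonable summary per type: counts for nominal data, a median level for ordinal data, a proportion for binary data, a total for a discrete count, and a mean for a continuous measurement.

```python
import statistics
from collections import Counter

# Hypothetical patient records, invented for illustration.
patients = [
    {"transport": "bus", "smoking": "light", "survived": True, "admissions": 2, "temp_c": 36.8},
    {"transport": "car", "smoking": "non-smoker", "survived": True, "admissions": 0, "temp_c": 37.2},
    {"transport": "bus", "smoking": "heavy", "survived": False, "admissions": 5, "temp_c": 38.1},
    {"transport": "bicycle", "smoking": "moderate", "survived": True, "admissions": 1, "temp_c": 36.6},
    {"transport": "car", "smoking": "light", "survived": True, "admissions": 3, "temp_c": 37.0},
]

# The order of the classes is what makes smoking status ordinal.
SMOKING_ORDER = ["non-smoker", "light", "moderate", "heavy"]

# Nominal: classes have no order, so count how often each occurs.
transport_counts = Counter(p["transport"] for p in patients)

# Ordinal: classes are ordered, so a median level is meaningful.
ranks = [SMOKING_ORDER.index(p["smoking"]) for p in patients]
median_smoking = SMOKING_ORDER[statistics.median_low(ranks)]

# Binary: two possible values, summarized as a proportion.
survival_rate = sum(p["survived"] for p in patients) / len(patients)

# Discrete: integer counts can be added up.
total_admissions = sum(p["admissions"] for p in patients)

# Continuous: any value in a range, summarized here by the mean.
mean_temp = statistics.mean(p["temp_c"] for p in patients)

print("transport counts:", dict(transport_counts))
print("median smoking level:", median_smoking)
print(f"survival rate: {survival_rate:.2f}")
print("total admissions:", total_admissions)
print(f"mean temperature: {mean_temp:.2f} °C")
```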

Properties of Each Unit (Sample = n)

  • Common: Properties that identify the unit, defined in time, in place, and objectively.
  • Investigated: Properties observed in the study (e.g., diagnosis, smoking, drug abuse).