Sampling, Data Collection, Analysis, and Reporting in Research

Summary 8: Sampling in Research

1. This chapter describes how to select a sample. The first question to ask is who is going to be measured, which corresponds to defining the unit of analysis: individuals, organizations, newspapers, and so on. You should then clearly delimit the population, based on the objectives of the study, in terms of its content characteristics, place, and time.

2. The sample is a subset of the population previously defined and may be probabilistic or non-probabilistic.

3. Choosing what type of sample is required depends on the objectives of the study and the research scheme.

4. Probability samples are essential in survey research designs whose results are to be generalized to a population. The defining characteristic of this type of sample is that every element of the population initially has the same chance of being selected, so the values measured in the sample are very precise estimates of the corresponding values in the population. How precise they are depends on the sampling error, also called the standard error.

5. Obtaining a probability sample requires two things: determining the sample size and selecting the sample elements randomly.

6. The sample size is calculated from the sample variance and the population variance. The former is expressed in terms of the probability of occurrence; the latter is the square of the standard error, which we set in advance. The lower the standard error we accept, the larger the sample must be. A sketch of the calculation follows.
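As an illustration only (the exact formulas are not given in this summary), a common two-step formulation computes a provisional size n' = s²/V², where the sample variance is s² = p(1 - p) for an assumed probability of occurrence p and the population variance V² is the square of the chosen standard error, and then adjusts n' for the population size N. All figures below are hypothetical.

```python
# Sketch of the sample-size calculation described above, assuming the
# common two-step formulation n' = s^2 / V^2 followed by a
# finite-population adjustment. All numbers are illustrative.

def sample_size(N, p=0.9, se=0.015):
    """Adjusted sample size for a population of N elements.

    p  -- assumed probability of occurrence (sample variance s^2 = p(1 - p))
    se -- acceptable standard error (population variance V^2 = se^2)
    """
    s2 = p * (1 - p)                     # sample variance
    V2 = se ** 2                         # population variance
    n_prime = s2 / V2                    # provisional size
    return n_prime / (1 + n_prime / N)   # finite-population adjustment

print(round(sample_size(N=1176)))  # about 298 for these illustrative values
```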

7. Probability samples may be simple, stratified, or cluster samples. Stratification increases the precision of the sample and involves deliberately drawing a sub-sample from each stratum or category that is relevant to the population (a proportional-allocation sketch follows). Cluster sampling introduces a difference between the unit of analysis and the sampling unit, and selection proceeds in two stages, both probabilistic: first the clusters are selected (schools, organizations, classrooms), and then, within the clusters, the subjects to be measured are chosen.
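To make the stratification idea concrete, the sketch below applies the usual proportional-allocation rule, a constant sampling fraction f = n/N for every stratum. The stratum names and sizes are invented for the example.

```python
# Proportional stratified allocation: each stratum contributes to the
# sample in proportion to its size, using the fraction f = n / N.
N, n = 1176, 298
strata = {"national dailies": 53, "local dailies": 109, "weeklies": 1014}  # sizes Nh

f = n / N  # constant sampling fraction applied to every stratum
allocation = {name: round(Nh * f) for name, Nh in strata.items()}
print(allocation)  # {'national dailies': 13, 'local dailies': 28, 'weeklies': 257}
```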

8. The elements of a probability sample are always chosen randomly, to ensure that every element has an equal chance of being selected. Selection procedures include: (1) the lottery (tombola) method, (2) tables of random numbers, and (3) systematic selection (sketched below). All of these procedures depend on sampling frames or listings, whether already existing or constructed ad hoc. Listings include telephone directories, membership lists of associations, school rosters, and the like. When no list of the population's elements exists, other frames containing descriptions of the material, organizations, or subjects selected as units of analysis are used, such as files, archives, and maps.
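The following is a minimal sketch of the third procedure, systematic selection: every K-th element of the frame is taken after a random start, with the interval K = N/n. The frame here is just a numbered list.

```python
# Systematic selection from a sampling frame: choose a random starting
# point and then take every K-th element, where K = N / n.
import random

def systematic_sample(frame, n):
    K = len(frame) // n          # selection interval
    start = random.randrange(K)  # random start within the first interval
    return [frame[start + i * K] for i in range(n)]

frame = list(range(1, 1177))     # e.g., a numbered listing of 1,176 elements
print(systematic_sample(frame, 10))
```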

9. Non-probability samples can also be called targeted samples, since the choice of subjects or objects of study depends on the discretion of the investigator.

10. Targeted samples can be of various kinds: (1) samples of volunteer subjects, frequently used in experimental designs and laboratory situations; (2) samples of experts, often used in exploratory studies; (3) samples of typical subjects or case studies, used in qualitative and motivational research; and (4) quota samples, frequently used in opinion polls and marketing research. Targeted samples are valid insofar as a particular research design requires them; however, their results generalize only to the sample itself or to similar samples. They cannot be generalized to a population.

11. The central limit theorem states that, in samples of more than one hundred cases, the distribution of sample characteristics (such as the sample mean) approaches a normal distribution, even when the population itself is not normally distributed. This approximate normality is what makes statistical tests and correct inferences about the population possible.
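A small simulation can make the theorem tangible. The sketch below (illustrative, not from the text) draws repeated samples of 100 cases from a clearly non-normal population and shows that the sample means cluster normally around the true mean.

```python
# Central limit theorem in miniature: means of samples drawn from a
# skewed (exponential) population are approximately normally distributed.
import random
import statistics

sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(100))
    for _ in range(5000)
]
print(round(statistics.mean(sample_means), 3))   # close to the population mean, 1.0
print(round(statistics.stdev(sample_means), 3))  # close to 1 / sqrt(100) = 0.1
```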

Summary 9: Data Collection

1. Collecting data involves three activities: selecting an available measuring instrument or developing one of our own, applying that instrument, and preparing the measurements obtained (coding them) so they can be analyzed correctly.

2. Measurement is the process of linking abstract concepts with empirical indicators through sorting and/or quantification.

3. In all research, we measure the variables in the hypotheses.

4. A measuring instrument must meet two criteria: reliability and validity.

5. Reliability refers to the degree that the repeated application of a measuring tool to the same subject or object produces the same results.

6. Validity refers to the degree that a measurement instrument really measures the variable(s) it purports to measure.

7. Three types of evidence for validity can be provided: content-related evidence, criterion-related evidence, and construct-related evidence.

8. The main factors that can undermine validity are improvisation, the use of instruments developed abroad that have not been validated in our context, instruments that show little or no empathy with the people to whom they are applied, and factors related to the conditions of application.

9. There is no perfect measure, but measurement error should be reduced to tolerable limits.

10. Reliability is determined by calculating a coefficient of reliability.

11. Reliability coefficients range between 0 and 1 (0 = null reliability, 1 = total reliability).

12. The most common procedures for estimating reliability are the measure of stability (test-retest), the alternate-forms method, the split-halves method, Cronbach's alpha, and KR-20; an alpha computation is sketched below.
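As an illustration of one of these coefficients, the sketch below computes Cronbach's alpha from its standard formula, alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores). The item scores are invented.

```python
# Cronbach's alpha for a small, invented data set: three items
# answered by five respondents.
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, all of equal length."""
    k = len(items)
    item_vars = sum(statistics.pvariance(scores) for scores in items)
    totals = [sum(col) for col in zip(*items)]  # total score per respondent
    return k / (k - 1) * (1 - item_vars / statistics.pvariance(totals))

items = [[3, 4, 5, 2, 4], [3, 5, 4, 2, 5], [2, 4, 5, 3, 4]]
print(round(cronbach_alpha(items), 2))  # about 0.89; 0 = null, 1 = total reliability
```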

13. Content validity is assessed by comparing the universe of possible items with the items actually present in the measuring instrument.

14. Criterion validity is obtained by comparing the results of applying the measurement tool against the results of an external criterion.

15. Construct validity can be determined through factor analysis.
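As a hedged illustration of this point, the sketch below runs an exploratory factor analysis with scikit-learn on invented item scores; comparable loadings on a single factor would support the claim that the items measure one construct.

```python
# Exploratory factor analysis on simulated data: four items that all
# reflect one latent construct, plus noise.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 1))                              # the construct
items = latent @ np.ones((1, 4)) + rng.normal(scale=0.5, size=(100, 4))

fa = FactorAnalysis(n_components=1).fit(items)
print(fa.components_.round(2))  # similar loadings suggest a single shared construct
```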

16. The generic steps to build a measuring instrument are:

  • List the variables to be measured.
  • Review conceptual and operational definitions.
  • Choose an already developed instrument or build your own.
  • Indicate levels of measurement of variables (nominal, ordinal, interval, and ratio).
  • Indicate how the data are to be coded.
  • Implement a pilot test.
  • Construct the final instrument.

17. In social research, we have several different measurement tools:

  1. Major attitude scales: Likert, Guttman, and Semantic Differential.
  2. Questionnaires (self-administered, by personal interview, by telephone interview, and by mail).
  3. Content analysis.
  4. Observation.
  5. Standardized tests (standard procedure).
  6. In-depth sessions.
  7. Records and other forms of measurement.

18. The responses are coded.

19. Codification involves:

a) Coding the responses to open-ended (non-precoded) items.

b) Preparing the codebook.

c) Performing physical coding.

d) Recording and storing data in a permanent file.
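To make steps (b) and (c) concrete, here is an invented fragment of a codebook and the physical coding of one response; the items, labels, and codes are hypothetical.

```python
# A tiny codebook fragment: each item maps answer labels to numeric codes.
codebook = {
    "sex": {"male": 1, "female": 2},
    "satisfaction": {"very low": 1, "low": 2, "medium": 3, "high": 4, "very high": 5},
}
MISSING = 9  # conventional code for a missing or unreadable answer

def encode(item, answer):
    """Physically code one answer using the codebook."""
    return codebook[item].get(answer, MISSING)

print(encode("sex", "female"))         # 2
print(encode("satisfaction", "high"))  # 4
```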

Summary 10: Data Analysis

1. Data analysis is performed on the data matrix, which is stored in a file.

2. The type of analysis or statistical tests to perform depends on the level of measurement of the variables, the hypotheses, and the interests of the researcher.

3. The statistical analyses may include descriptive statistics for each variable (frequency distribution, measures of central tendency, and measures of variability), the transformation of raw scores to z-scores, ratios and rates, inferential statistics, parametric tests, nonparametric tests, and multivariate analysis.

4. Frequency distributions contain the categories, codes, absolute frequencies (number of cases), relative frequencies (percentages), and cumulative frequency (absolute or relative).
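The sketch below builds such a distribution for a handful of invented answers, printing the absolute, relative, and cumulative frequencies per category.

```python
# A minimal frequency distribution: categories with absolute, relative,
# and cumulative frequencies.
from collections import Counter

answers = ["high", "medium", "high", "low", "high", "medium", "medium", "high"]
n = len(answers)

cumulative = 0
for category, absolute in Counter(answers).most_common():
    cumulative += absolute
    print(f"{category:<8} abs={absolute}  rel={absolute / n:.0%}  cum={cumulative}")
```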

5. Frequency distributions (particularly the relative frequencies) can be presented graphically.

6. A frequency distribution can be represented through the frequency polygon or frequency curve.

7. The measures of central tendency are the mode, the median, and the mean (average).

8. The measures of variability are the range (difference between the maximum and minimum), standard deviation, and variance.

9. Other useful descriptive statistics are skewness and kurtosis.
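The following sketch computes the central-tendency and variability measures from points 7 and 8 on a small, invented set of scores, using Python's standard statistics module (which does not itself provide skewness or kurtosis).

```python
# Descriptive statistics for ten invented scores.
import statistics

scores = [4, 8, 6, 5, 3, 7, 6, 9, 6, 5]
print(statistics.mode(scores))        # mode: 6
print(statistics.median(scores))      # median: 6.0
print(statistics.mean(scores))        # mean: 5.9
print(max(scores) - min(scores))      # range: 6
print(statistics.pstdev(scores))      # standard deviation
print(statistics.pvariance(scores))   # variance
```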

10. Z-scores are transformations of the values obtained to standard deviation units.
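That transformation is simply z = (x - mean) / s, as the short sketch below shows on the same invented scores.

```python
# Z-score transformation: re-express each raw score in standard
# deviation units from the mean.
import statistics

scores = [4, 8, 6, 5, 3, 7, 6, 9, 6, 5]
mean, s = statistics.mean(scores), statistics.pstdev(scores)
z_scores = [(x - mean) / s for x in scores]
print([round(z, 2) for z in z_scores])  # these z-scores have mean 0 and stdev 1
```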

11. A ratio is the relationship between two categories (for example, 2 men for every woman), and a rate is the ratio of the number of cases in a category to the total number of cases, multiplied by a multiple of 10 such as 100 or 1,000 (for example, 100 thefts among 50,000 inhabitants is a rate of 2 thefts per 1,000 inhabitants).

12. Inferential statistics is used to generalize from the sample to the population: to test hypotheses and to estimate parameters. It rests on the concept of the sampling distribution.

13. The normal curve or distribution is a very useful theoretical model; in its standardized form, its mean is 0 (zero) and its standard deviation is one (1).

14. The significance level and the confidence level express the probability of error we accept when testing hypotheses or estimating parameters. The most common significance levels in the social sciences are .05 and .01.

15. The most commonly used parametric statistical tests, together with the type of hypothesis each evaluates, are the following (a t-test sketch follows the list):

  • Pearson correlation coefficient: correlational
  • Linear regression: correlational/causal
  • T-test: difference between groups
  • Test of the difference between group proportions: difference between groups
  • Analysis of variance (ANOVA), one-way (one independent variable) or factorial (two or more independent variables): difference between groups/causal
  • Analysis of covariance (ANCOVA): correlational/causal
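As promised above, here is a minimal sketch of one test from the list, the t-test for the difference between two group means, run with SciPy on invented scores.

```python
# Independent-samples t-test: does the mean of group A differ from
# the mean of group B?
from scipy import stats

group_a = [72, 75, 78, 71, 74, 77, 73]
group_b = [68, 70, 69, 72, 66, 71, 67]

t, p = stats.ttest_ind(group_a, group_b)
print(f"t = {t:.2f}, p = {p:.4f}")  # compare p with the chosen level, e.g. .05
```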

16. In all parametric statistical tests, the variables are measured at an interval or ratio level.

17. The most commonly used nonparametric statistical tests are the following (a chi-square sketch follows the list):

  • Chi-square: difference between groups
  • Spearman and Kendall coefficients: correlational
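And a matching sketch of the chi-square test of independence, run on an invented 2 x 2 contingency table with SciPy.

```python
# Chi-square test of independence on a small contingency table.
from scipy import stats

table = [[30, 10],   # group 1: yes / no
         [20, 25]]   # group 2: yes / no

chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, df = {dof}")
```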

18. Statistical analysis is carried out using computer programs (statistical packages).

19. Among the best-known statistical packages are BMDP, ESP, OSIRIS, SAS, and SPSS. To use any of these packages, consult its manual.

Summary 11: Research Report Preparation

1. Before preparing the research report, the user (the audience) must be defined, since the report will be adapted to that user.

2. Research reports can be presented in an academic context or in a non-academic context.

3. The context determines the format, nature, and extent of the research report.

4. The most common elements of a research report presented in an academic context are: title page, index, abstract, introduction, theoretical framework, method, results, conclusions, bibliography, and appendices.

5. The most common elements in a non-academic report are: title page, index, abstract, introduction, methods, results, conclusions, and appendices.

6. Various aids (for example, audiovisual materials) can be used to present the research report.

Summary 7: Non-Experimental Research

1. Non-experimental research is carried out without deliberately manipulating the independent variables; it examines variables and relationships as they have already occurred in reality, without the direct intervention of the investigator. It is therefore a retrospective approach.

2. Non-experimental research is also known as ex post facto research (the facts and variables have already occurred), since it observes variables and the relationships between them in their natural context.

3. Non-experimental designs are divided as follows:

  • Cross-sectional (transactional)
  • Longitudinal

4. Cross-sectional (transactional) designs make observations at a single moment in time. When they measure variables individually and report those measurements, they are descriptive. When they describe relationships between variables, they are correlational, and when they establish causal relationships between variables, they are correlational/causal.

5. Longitudinal designs make observations at two or more points in time. If the study follows a population, they are trend designs; if it follows a specific group or subpopulation, they are cohort (group evolution) designs; and if it follows the same subjects each time, they are panel designs.

6. Non-experimental research is less rigorous than experimental research, and it is more difficult to infer causal relationships from it. However, non-experimental research is more natural and closer to everyday reality.

7. The type of design to choose is influenced by the research problem, the context surrounding the research, and the type of study.