Frequency Distribution and Statistical Data Analysis of Student Heights

Statistical Data from Quantitative Variables

We use statistical methods to describe data derived from quantitative variables.

Primitive Table of Student Heights

Table 5.1 shows the raw, unorganized height data (in cm) for a group of 40 students. A table like this, whose elements are not numerically organized, is called a primitive table: 166, 160, 150, 162, 160, 165, 161, 167, 164, 160, 162, 161, 168, 163, 156, 173, 160, 155, 164, 168, 155, 152, 163, 160, 155, 155, 169, 151, 170, 164, 154, 161, 156, 172, 153, 157, 156, 158, 158, 161.

Data in this form is difficult to interpret. Sorting the values in ascending or descending order produces an ordered array (a rol), which makes analysis easier.

Sorted Data (in cm): 150, 151, 152, 153, 154, 155, 155, 155, 155, 156, 156, 156, 157, 158, 158, 160, 160, 160, 160, 160, 161, 161, 161, 161, 162, 162, 163, 163, 164, 164, 164, 165, 166, 167, 168, 168, 169, 170, 172, 173.

Now, we can easily identify the smallest height (150 cm) and the tallest (173 cm). The range of variation is 23 cm (173 – 150). We can also observe a concentration of heights between 160 cm and 165 cm.
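The sorting step can be sketched in a few lines of Python. To keep the example short, it uses only the last ten heights listed in Table 5.1, so the minimum, maximum, and range it prints describe that excerpt and not the full group. The variable names are illustrative.

```python
# Excerpt only: the last ten heights (cm) listed in Table 5.1.
heights = [154, 161, 156, 172, 153, 157, 156, 158, 158, 161]

ordered = sorted(heights)                    # ascending ordered array
smallest, largest = ordered[0], ordered[-1]  # extremes are now at the ends
sample_range = largest - smallest            # range of variation

print(ordered)
print(f"min = {smallest} cm, max = {largest} cm, range = {sample_range} cm")
```

Once the values are sorted, the extremes are simply the first and last elements, which is why the ordered array is easier to read than the primitive table.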

Frequency Distribution

A frequency distribution table further simplifies data analysis by listing each height value and the number of times it appears (its frequency).

Table 5.3: Frequency Distribution of Student Heights

150 – 1; 151 – 1; 152 – 1; 153 – 1; 154 – 1; 155 – 4; 156 – 3; 157 – 1; 158 – 2; 160 – 5; 161 – 4; 162 – 2; 163 – 2; 164 – 3; 165 – 1; 166 – 1; 167 – 1; 168 – 2; 169 – 1; 170 – 1; 172 – 1; 173 – 1; Total: 40
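As a rough illustration of how such a table is built, the sketch below counts repeated values with Python's standard collections.Counter. It again uses only the last ten heights from Table 5.1, so its counts cover that excerpt and are not meant to reproduce Table 5.3.

```python
from collections import Counter

# Excerpt only: the last ten heights (cm) listed in Table 5.1.
heights = [154, 161, 156, 172, 153, 157, 156, 158, 158, 161]

frequency = Counter(heights)  # maps each height to the number of times it occurs

# Print one "value - frequency" pair per line, in ascending order of height.
for value in sorted(frequency):
    print(f"{value} - {frequency[value]}")

total = sum(frequency.values())
print(f"Total: {total}")
```

The total of the frequencies always equals the number of observations, which is a quick check that no value was dropped while counting.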

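To make this more concise, we can group height values into class intervals (or classes). In the table below, each class includes its lower limit and excludes its upper limit, so a height of 154 cm is counted in Class 2, not in Class 1.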

Table 5.4: Frequency Distribution with Class Intervals

Height of 40 Students

  • Class 1 (150-154 cm): 4
  • Class 2 (154-158 cm): 9
  • Class 3 (158-162 cm): 11
  • Class 4 (162-166 cm): 8
  • Class 5 (166-170 cm): 5
  • Class 6 (170-174 cm): 3
  • Total: 40

While grouping simplifies the table, we lose some detail. For example, we can no longer see the exact number of students with a height of 161 cm, but we know that 11 students have heights between 158 cm and 162 cm.
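The grouping can be reproduced from the value/frequency pairs of Table 5.3. The sketch below assumes that each class includes its lower limit and excludes its upper limit, which is what the counts in Table 5.4 imply. The constant names are illustrative.

```python
# (height in cm: frequency) pairs copied from Table 5.3.
table_5_3 = {
    150: 1, 151: 1, 152: 1, 153: 1, 154: 1, 155: 4, 156: 3, 157: 1,
    158: 2, 160: 5, 161: 4, 162: 2, 163: 2, 164: 3, 165: 1, 166: 1,
    167: 1, 168: 2, 169: 1, 170: 1, 172: 1, 173: 1,
}

FIRST_LOWER_LIMIT = 150  # lower limit of Class 1
CLASS_WIDTH = 4          # every class spans 4 cm
NUM_CLASSES = 6

class_counts = [0] * NUM_CLASSES
for height, count in table_5_3.items():
    # Integer division places e.g. 154 in index 1 (Class 2), not index 0.
    index = (height - FIRST_LOWER_LIMIT) // CLASS_WIDTH
    class_counts[index] += count

for i, count in enumerate(class_counts):
    lower = FIRST_LOWER_LIMIT + i * CLASS_WIDTH
    print(f"Class {i + 1} ({lower}-{lower + CLASS_WIDTH} cm): {count}")
print(f"Total: {sum(class_counts)}")
```

The loss of detail is visible here: the individual heights go in, but only six counts come out.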

Class Intervals and Frequency

5.3 Class Definitions

Each class is identified by an index i, where i = 1, 2, 3, ..., K and K is the total number of classes.

Example 1: In Table 5.4 there are K = 6 classes, and the 3rd class (i = 3) is the interval 158-162 cm.

5.3.2 Class Limits

Class limits are the extremes of each class. The smaller value is the lower limit (li), and the larger value is the upper limit (Li).

Example 2: In Table 5.4, for the 5th class, the lower limit (l5) is 166 cm, and the upper limit (L5) is 170 cm.

5.3.3 Class Interval Amplitude

The amplitude (hi) of a class interval is the difference between its upper and lower limits: hi = Li – li.

Example 3: For the 5th class in Table 5.4, the amplitude is h5 = 170 – 166 = 4 cm.

5.3.4 Total Distribution Range

The total distribution range (AT) is the difference between the upper limit of the last class (Lmax) and the lower limit of the first class (lmin): AT = Lmax – lmin.

Example: AT = 174 – 150 = 24 cm.

5.3.5 Sample Range

The sample range (AA) is the difference between the maximum and minimum values in the sample: AA = Xmax – Xmin.

Example: AA = 173 – 150 = 23 cm.

5.3.6 Class Midpoint

The midpoint (Xi) of a class is the average of its lower and upper limits: Xi = (li + Li) / 2.

Example 5: The midpoint of the 2nd class in Table 5.4 is X2 = (154 + 158) / 2 = 156 cm.
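The amplitude, total distribution range, and midpoint formulas can be checked together with a short Python sketch that takes the class limits from Table 5.4. The list and variable names are illustrative.

```python
# Class limits (lower, upper) in cm, copied from Table 5.4.
classes = [(150, 154), (154, 158), (158, 162), (162, 166), (166, 170), (170, 174)]

amplitudes = [upper - lower for lower, upper in classes]       # hi = Li - li
midpoints = [(lower + upper) / 2 for lower, upper in classes]  # Xi = (li + Li) / 2
total_range = classes[-1][1] - classes[0][0]                   # AT = Lmax - lmin

for i, (h, x) in enumerate(zip(amplitudes, midpoints), start=1):
    print(f"class {i}: amplitude = {h} cm, midpoint = {x} cm")
print(f"total distribution range = {total_range} cm")
```

Every class has the same amplitude of 4 cm, and the values for the 2nd and 5th classes match the worked examples above.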

5.3.7 Simple Absolute Frequency

The simple absolute frequency (fi) is the number of observations in a class. In our example, f1 = 4, f2 = 9, f3 = 11, f4 = 8, f5 = 5, and f6 = 3. The sum of all frequencies is Σfi = 40.

5.4 Number of Classes and Ranges

Sturges’ rule gives a starting point for the number of classes: K ≈ 1 + 3.3 * log(n), where log is the base-10 logarithm and n is the total number of data points. This is only a guideline; the final choice depends on the data.

In our example (n = 40), Sturges’ rule gives K ≈ 1 + 3.3 * 1.602 ≈ 6.3, which suggests 6 classes. Dividing the sample range of 23 cm by 6 gives about 3.8 cm, so a class interval of 4 cm is reasonable.
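The arithmetic behind this choice can be sketched in Python. Rounding the rule's result to the nearest whole number and rounding the class width up to a whole centimetre are assumptions made for this illustration; the text only says that the rule is a guideline.

```python
import math

n = 40             # number of observations
sample_range = 23  # Xmax - Xmin, in cm

raw_classes = 1 + 3.3 * math.log10(n)           # Sturges' rule, base-10 logarithm
num_classes = round(raw_classes)                # assumption: round to the nearest integer
class_width = math.ceil(sample_range / num_classes)  # assumption: round the width up

print(f"Sturges' rule: {raw_classes:.2f} -> {num_classes} classes")
print(f"class width: {sample_range / num_classes:.2f} -> {class_width} cm")
```

With 6 classes of 4 cm the classes span 24 cm, which is enough to cover the 23 cm sample range.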

5.5 Types of Frequencies

5.5.1 Simple or Absolute Frequency (fi)

This is the actual number of data points in each class (as shown in Table 5.4).

5.5.2 Relative Frequency (fri)

This is the ratio of the simple frequency of a class to the total frequency: fri = fi / Σfi.

5.5.3 Cumulative Frequency (Fi)

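This is the sum of the simple frequencies of all classes up to and including a given class: Fi = f1 + f2 + ... + fi. For example, in Table 5.4, F3 = 4 + 9 + 11 = 24, meaning 24 students have heights below 162 cm.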

5.5.4 Relative Cumulative Frequency (Fri)

This is the ratio of the cumulative frequency of a class to the total frequency: Fri = Fi / Σfi.
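The four types of frequency can be computed side by side from the class frequencies in Table 5.4. This is a minimal sketch using the standard itertools.accumulate for the running sums; the variable names are illustrative.

```python
from itertools import accumulate

# Simple (absolute) frequencies fi of the six classes in Table 5.4.
frequencies = [4, 9, 11, 8, 5, 3]
total = sum(frequencies)                               # sum of fi

relative = [f / total for f in frequencies]            # fri = fi / sum of fi
cumulative = list(accumulate(frequencies))             # Fi = f1 + ... + fi
relative_cumulative = [F / total for F in cumulative]  # Fri = Fi / sum of fi

print("class   fi    fri   Fi    Fri")
for i, row in enumerate(zip(frequencies, relative, cumulative, relative_cumulative), start=1):
    f, fr, F, Fr = row
    print(f"{i:>5} {f:>4} {fr:>6.3f} {F:>4} {Fr:>6.3f}")
```

Two checks follow directly from the definitions: the relative frequencies add up to 1, and the last cumulative frequency equals the total number of observations.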

Exercises and Examples

The provided exercises and examples demonstrate calculations for relative frequency, cumulative frequency, and relative cumulative frequency. They also include questions about interpreting frequency distributions.