What types of graphs are used to represent the frequencies of continuous data that are arranged into class intervals?

  • Example 1: There are 20 students in a class. The teacher, Ms. Jolly, asked the students to name their favorite subject. The results are as follows: Mathematics, English, Science, Science, Mathematics, Science, English, Art, Mathematics, Mathematics, Science, Art, Art, Science, Mathematics, Art, Mathematics, English, English, Mathematics.

    Represent this data in the form of a frequency distribution and identify the most-liked subject.

    Solution: 20 students have indicated their preferred subjects. Let us represent this data using tally marks, which show the frequency of each subject.

    Subject        Tally       Frequency
    Mathematics    ||||| ||    7
    Science        |||||       5
    English        ||||        4
    Art            ||||        4

    According to the above frequency distribution, mathematics is the most liked subject.
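
    The tally in Example 1 can also be produced programmatically. The following is a minimal sketch using Python's standard `collections.Counter`; the variable names are illustrative and not part of the original example.

```python
from collections import Counter

# Responses exactly as listed in Example 1.
responses = [
    "Mathematics", "English", "Science", "Science", "Mathematics",
    "Science", "English", "Art", "Mathematics", "Mathematics",
    "Science", "Art", "Art", "Science", "Mathematics",
    "Art", "Mathematics", "English", "English", "Mathematics",
]

# Counter builds the frequency distribution: subject -> number of students.
frequency = Counter(responses)

# Print one row per subject, most frequent first, with a simple tally.
for subject, count in frequency.most_common():
    print(f"{subject:<12}{'|' * count:<9}{count}")

# The mode of the distribution is the most-liked subject.
most_liked, votes = frequency.most_common(1)[0]
print(f"Most-liked subject: {most_liked} ({votes} of {len(responses)} students)")
```

    Because the data are categorical, no class intervals are needed: each distinct subject is its own category.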

  • Example 2: On World Environment Day, 100 schools each planted 100 tree saplings in their gardens. The number of saplings that survived in each school is listed below. Represent the data in the form of a frequency distribution and find the number of schools in which 50% or more of the saplings survived.
    95, 67, 28, 32, 65, 65, 69, 33, 98, 96, 76, 42, 32, 38, 42, 40, 40, 69, 95, 92, 75, 83, 76, 83, 85, 62, 37, 65, 63, 42, 89, 65, 73, 81, 49, 52, 64, 76, 83, 92, 93, 68, 52, 79, 81, 83, 59, 82, 75, 82, 86, 90, 44, 62, 31, 36, 38, 42, 39, 83, 87, 56, 58, 23, 35, 76, 83, 85, 30, 68, 69, 83, 86, 43, 45, 39, 83, 75, 66, 83, 92, 75, 89, 66, 91, 27, 88, 89, 93, 42, 53, 69, 90, 55, 66, 49, 52, 83, 34, 36

    Solution: To include all the observations, we group them into classes of equal width. These groups are called class intervals. In the frequency distribution below, the class intervals give the number of plants that survived, and the frequency is the number of schools falling in each interval.

    Plants survived    Number of schools (frequency)
    20–29               3
    30–39              14
    40–49              12
    50–59               8
    60–69              18
    70–79              10
    80–89              23
    90–99              12

    Adding the frequencies of the class intervals from 50–59 to 90–99 gives 8 + 18 + 10 + 23 + 12 = 71. Thus, 71 schools were able to retain 50% or more of the plants in their gardens.
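
    The grouping in Example 2 can be checked with a short script. This is a sketch under the assumption that each class interval has a width of 10 and is identified by its lower limit (50 stands for 50–59, and so on); the names are illustrative.

```python
from collections import Counter

# Number of surviving plants per school, as listed in Example 2.
survived = [
    95, 67, 28, 32, 65, 65, 69, 33, 98, 96,
    76, 42, 32, 38, 42, 40, 40, 69, 95, 92,
    75, 83, 76, 83, 85, 62, 37, 65, 63, 42,
    89, 65, 73, 81, 49, 52, 64, 76, 83, 92,
    93, 68, 52, 79, 81, 83, 59, 82, 75, 82,
    86, 90, 44, 62, 31, 36, 38, 42, 39, 83,
    87, 56, 58, 23, 35, 76, 83, 85, 30, 68,
    69, 83, 86, 43, 45, 39, 83, 75, 66, 83,
    92, 75, 89, 66, 91, 27, 88, 89, 93, 42,
    53, 69, 90, 55, 66, 49, 52, 83, 34, 36,
]

WIDTH = 10  # assumed class width

# Map every observation to the lower limit of its class interval and count.
counts = Counter((value // WIDTH) * WIDTH for value in survived)

for lower in sorted(counts):
    print(f"{lower}-{lower + WIDTH - 1}: {counts[lower]}")

# Schools in which at least 50 of the 100 plants survived.
schools_at_least_half = sum(n for lower, n in counts.items() if lower >= 50)
print("Schools retaining 50% or more:", schools_at_least_half)
```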

  • A frequency table is simply a “t-chart” or two-column table which outlines the various possible outcomes and the associated frequencies observed in a sample.

    From: The Joy of Finite Mathematics, 2016

    S Manikandan, Assistant Editor, JPP

    Copyright © Journal of Pharmacology and Pharmacotherapeutics

    This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

    The next step after the completion of data collection is to organize the data into a meaningful form so that a trend, if any, emerging out of the data can be seen easily. One of the common methods for organizing data is to construct a frequency distribution. A frequency distribution is an organized tabulation or graphical representation of the number of individuals in each category on the scale of measurement.[1] It allows the researcher to view the entire data set at a glance. It shows whether the observations are high or low and also whether they are concentrated in one area or spread out across the entire scale. Thus, a frequency distribution presents a picture of how the individual observations are distributed on the measurement scale.

    A frequency (distribution) table shows the different measurement categories and the number of observations in each category. Before constructing a frequency table, one should have an idea about the range (minimum and maximum values). The range is divided into arbitrary intervals called “class intervals.” If there are too many class intervals, the data remain bulky and minor deviations become noticeable. On the other hand, if there are too few, the shape of the distribution cannot be determined. Generally, 6–14 intervals are adequate.[2]

    The width of the class can be determined by dividing the range of observations by the number of classes. The following are some guidelines regarding class widths:[1]

    • It is advisable to have equal class widths. Unequal class widths should be used only when large gaps exist in data.

    • The class intervals should be mutually exclusive and nonoverlapping.

    • Open-ended classes at the lower and upper side (e.g., <10, >100) should be avoided.
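
    The width calculation and the guidelines above can be sketched in a few lines. The function below is a hypothetical helper for integer-valued data: it divides the range by the desired number of classes, rounds the width up, and returns equal, nonoverlapping intervals that cover the whole range.

```python
import math


def class_intervals(minimum, maximum, num_classes):
    """Return (width, [(lower, upper), ...]) for integer-valued data.

    The width is the range divided by the number of classes, rounded up.
    Because of the rounding, the result may contain one class more than
    requested when the range divides evenly.
    """
    width = max(1, math.ceil((maximum - minimum) / num_classes))
    intervals = []
    lower = minimum
    while lower <= maximum:
        # Inclusive limits, so consecutive classes never overlap.
        intervals.append((lower, lower + width - 1))
        lower += width
    return width, intervals


# Pulse rates ranging from 60 to 99 per minute, split into 8 classes.
width, intervals = class_intervals(60, 99, 8)
print("Class width:", width)
for lower, upper in intervals:
    print(f"{lower}-{upper}")
```

    With a minimum of 60, a maximum of 99, and 8 classes, the range of 39 divided by 8 is 4.875, which rounds up to a width of 5.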

    The frequency distribution table of the resting pulse rate in healthy individuals is given in Table 1. It also gives the cumulative frequency and the relative cumulative frequency, which help to interpret the data more easily.

    Table 1: Frequency distribution of the resting pulse rate in healthy volunteers (N = 63)

    Pulse/min    Frequency    Cumulative frequency    Relative cumulative frequency (%)
    60–64         2            2                        3.17
    65–69         7            9                       14.29
    70–74        11           20                       31.75
    75–79        15           35                       55.56
    80–84        10           45                       71.43
    85–89         9           54                       85.71
    90–94         6           60                       95.24
    95–99         3           63                      100
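
    The cumulative and relative cumulative columns of Table 1 follow directly from the frequency column. A minimal sketch, assuming only the frequencies from the table:

```python
from itertools import accumulate

classes = ["60-64", "65-69", "70-74", "75-79", "80-84", "85-89", "90-94", "95-99"]
frequencies = [2, 7, 11, 15, 10, 9, 6, 3]

# Running total of the frequencies.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Cumulative frequency expressed as a percentage of all observations.
relative_cumulative = [round(100 * c / total, 2) for c in cumulative]

for label, freq, cum, rel in zip(classes, frequencies, cumulative, relative_cumulative):
    print(f"{label:<8}{freq:>4}{cum:>6}{rel:>9.2f}")
```

    Reading the output, 55.56% of the volunteers have a resting pulse rate of 79 per minute or lower.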

    A frequency distribution graph is a diagrammatic illustration of the information in the frequency table.

    Histogram

    A histogram is a graphical representation with the variable of interest on the X axis and the number of observations (frequency) on the Y axis. Percentages can be used instead of counts if the objective is to compare two histograms based on different numbers of subjects. A histogram is used to depict the frequency when data are measured on an interval or a ratio scale. Figure 1 depicts a histogram constructed for the data given in Table 1.
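
    As a rough stand-in for a plotted figure, the sketch below prints a text histogram of the Table 1 frequencies and also converts each frequency to a percentage, the form suggested above for comparing groups of different sizes. It is turned on its side, so the pulse rate runs down the page and the bar length shows the frequency.

```python
classes = ["60-64", "65-69", "70-74", "75-79", "80-84", "85-89", "90-94", "95-99"]
frequencies = [2, 7, 11, 15, 10, 9, 6, 3]

total = sum(frequencies)

# Percentage of all observations that fall in each class.
percentages = [100 * freq / total for freq in frequencies]

for label, freq, pct in zip(classes, frequencies, percentages):
    bar = "#" * freq  # bar length is proportional to the frequency
    print(f"{label} | {bar:<15} {freq:>2} ({pct:4.1f}%)")

# The tallest bar marks the modal class.
modal_class = classes[frequencies.index(max(frequencies))]
print("Modal class:", modal_class)
```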

    A bar diagram and a histogram may look the same but there are three important differences between them:[3,4]

    • In a histogram, there is no gap between the bars as the variable is continuous. A bar diagram will have space between the bars.

    • All the bars need not be of equal width in a histogram (depends on the class interval), whereas they are equal in a bar diagram.

    • The area of each bar corresponds to the frequency in a histogram whereas in a bar diagram, it is the height [Figure 1].
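
    The point about area can be made concrete. In the sketch below, the last two classes of Table 1 are merged into a single wider class (90–99) purely for illustration. Each bar's height is then the frequency divided by the class width, so that the area of the bar, not its height, equals the frequency.

```python
# (lower limit, upper limit, frequency); 90-99 merges the last two
# classes of Table 1 as an illustrative unequal-width class.
classes = [
    (60, 64, 2), (65, 69, 7), (70, 74, 11), (75, 79, 15),
    (80, 84, 10), (85, 89, 9), (90, 99, 9),
]

bars = []
for lower, upper, frequency in classes:
    width = upper - lower + 1    # integer pulse values: 60-64 spans 5 values
    height = frequency / width   # frequency per unit of pulse rate
    bars.append((lower, upper, width, height))
    print(f"{lower}-{upper}: width={width}, height={height:.2f}, "
          f"area={height * width:.0f}")
```

    The classes 85–89 and 90–99 both contain 9 observations, but the wider class is drawn half as tall, so the two bars have the same area.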

    Frequency polygon

    A frequency polygon is constructed by connecting the midpoints of the tops of the bars in a histogram with straight lines, without displaying the bars. A frequency polygon aids in the easy comparison of two frequency distributions. When the total frequency is large and the class intervals are narrow, the frequency polygon becomes a smooth curve known as the frequency curve. A frequency polygon illustrating the data in Table 1 is shown in Figure 2.
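
    The vertices of a frequency polygon for Table 1 can be computed as follows. This is a small sketch in which each vertex pairs the midpoint of a class interval with the frequency of that class; joining consecutive vertices with straight lines gives the polygon.

```python
classes = [(60, 64), (65, 69), (70, 74), (75, 79),
           (80, 84), (85, 89), (90, 94), (95, 99)]
frequencies = [2, 7, 11, 15, 10, 9, 6, 3]

# Midpoint of each class interval (the centre of the top of each bar).
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Each vertex of the polygon is (class midpoint, frequency).
vertices = list(zip(midpoints, frequencies))

for x, y in vertices:
    print(f"({x:.0f}, {y})")
```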

    Box and whisker plot

    This graph, first described by Tukey in 1977, can also be used to illustrate the distribution of data. It consists of a vertical or horizontal rectangle (box), the ends of which correspond to the upper and lower quartiles (75th and 25th percentiles, respectively). Hence the middle 50% of observations are represented by the box. The length of the box indicates the variability of the data. The line inside the box denotes the median (sometimes marked as a plus sign). The position of the median indicates whether the data are skewed or not. If the median is closer to the upper quartile, the data are negatively skewed; if it is closer to the lower quartile, they are positively skewed.

    The lines outside the box on either side are known as whiskers [Figure 3]. Each whisker extends from the box to the most extreme observation lying within 1.5 times the length of the box, i.e., the interquartile range (IQR). The limits at 1.5 times the IQR beyond the quartiles are called the inner fences, and any value outside them is an outlier. If the distribution is symmetrical, then the whiskers are of roughly equal length. If the data are sparse on one side, the whisker on that side will be short. The outer fences (usually not marked) lie at a distance of three times the IQR on either side of the box. The fences are placed at 1.5 and 3 times the IQR, respectively, so that for approximately normally distributed data very few observations fall beyond them: more than 99% of observations lie within the inner fences, and virtually all lie within the outer fences.[5]
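
    The quantities behind a box and whisker plot can be computed with the standard library. The sketch below uses a small hypothetical set of pulse rates (not taken from the article) and the "inclusive" quartile method of `statistics.quantiles`; other quartile conventions exist and can give slightly different values.

```python
from statistics import quantiles

# Hypothetical resting pulse rates (per minute), for illustration only.
pulse = [62, 68, 70, 72, 75, 78, 80, 84, 120]

# Lower quartile, median, and upper quartile.
q1, median, q3 = quantiles(pulse, n=4, method="inclusive")
iqr = q3 - q1  # length of the box

# Inner fences at 1.5 x IQR and outer fences at 3 x IQR beyond the quartiles.
inner_fence = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
outer_fence = (q1 - 3 * iqr, q3 + 3 * iqr)

# Whiskers end at the most extreme observations inside the inner fences.
inside = [x for x in pulse if inner_fence[0] <= x <= inner_fence[1]]
whiskers = (min(inside), max(inside))

# Observations beyond the inner fences are flagged as outliers.
outliers = [x for x in pulse if x < inner_fence[0] or x > inner_fence[1]]

print("Q1, median, Q3:", q1, median, q3)
print("IQR:", iqr)
print("Inner fences:", inner_fence)
print("Outer fences:", outer_fence)
print("Whisker ends:", whiskers)
print("Outliers:", outliers)
```

    Here the median (75) sits midway between the quartiles (70 and 80), while the single value of 120 lies beyond even the outer fence and is reported as an outlier.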

    There are four important characteristics of frequency distribution.[6] They are as follows:

    • Measures of central tendency and location (mean, median, mode)

    • Measures of dispersion (range, variance, standard deviation)

    • The extent of symmetry/asymmetry (skewness)

    • The flatness or peakedness (kurtosis).

    These will be dealt with in detail in the next issue.

    Source of Support: Nil

    Conflict of Interest: None declared

    References

    1. Gravetter FJ, Wallnau LB. Statistics for the behavioral sciences. 5th ed. Belmont: Wadsworth – Thomson Learning; 2000.

    2. Dawson B, Trapp RG. Basic and clinical biostatistics. 4th ed. New York: McGraw Hill; 2004.

    3. Sundaram KR, Dwivedi SN, Sreenivas V. Medical statistics principles and methods. 1st ed. New Delhi: B.I Publications Pvt Ltd; 2010.

    4. Swinscow TDV, Campbell MJ. Statistics at square one. 10th ed (Indian). New Delhi: Viva Books Private Limited; 2003.

    5. Norman GR, Streiner DL. Biostatistics the bare essentials. 2nd ed. Hamilton: B.C. Decker Inc; 2000.

    6. Sundar Rao PS, Richard J. Introduction to biostatistics and research methods. 4th ed. New Delhi: Prentice Hall of India Pvt Ltd; 2006.

    Articles from Journal of Pharmacology & Pharmacotherapeutics are provided here courtesy of Wolters Kluwer -- Medknow Publications
