Types of Data

  • Tabulation
  • Simple bar chart
  • Component bar chart
  • Multiple bar chart
  • Pie chart

As indicated in the last lecture, there are two broad categories of data … qualitative data and quantitative data. A variety of methods exist for summarizing and describing these two types of data. The tree diagram below presents an outline of the various techniques.

statistics and probability  Types Of  Data


In today’s lecture, we will be dealing with various techniques for summarizing and describing qualitative data.


We will begin with the univariate situation, and will proceed to the bivariate situation.

EXAMPLE

Suppose that we are carrying out a survey of the students of first year studying in a coeducational college of Lahore. Suppose that in all there are 1200 students of first year in this large college. We wish to determine what proportion of these students have come from Urdu medium schools and what proportion has come from English medium schools. So we will interview the students and we will inquire from each one of them about their schooling. As a result, we will obtain a set of data as you can now see on the screen. We will have an array of observations as follows:

U, U, E, U, E, E, E, U, …

(U : URDU MEDIUM)

(E : ENGLISH MEDIUM)

Now, the question is what should we do with this data? Obviously, the first thing that comes to mind is to count the number of students who said “Urdu medium” as well as the number of students who said “English medium”. This will result in the following table:

Medium of Institution    No. of Students (f)
Urdu                     719
English                  481
Total                    1200

The technical term for the numbers given in the second column of this table is “frequency”, meaning how frequently something happens. Out of the 1200 students, 719 stated that they had come from Urdu medium schools. So in this example, the frequency of the first category of responses is 719 whereas the frequency of the second category of responses is 481.

This information becomes more useful if we compute the proportion or percentage of students falling in each category. Dividing the cell frequencies by the total frequency and multiplying by 100, we obtain the following:

Medium of Institution    f       %
Urdu                     719     59.9 ≈ 60%
English                  481     40.1 ≈ 40%
Total                    1200
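The counting and percentage steps can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` is hypothetical, and the list `responses` is stand-in data built to match the totals in the table (719 and 481), not the actual survey responses.

```python
from collections import Counter

def frequency_table(observations):
    """Return {category: (frequency, percentage)} for categorical observations."""
    counts = Counter(observations)        # tally each distinct response
    total = sum(counts.values())          # total frequency
    return {cat: (f, round(100 * f / total, 1)) for cat, f in counts.items()}

# Stand-in data (an assumption): 719 "U" and 481 "E" responses.
responses = ["U"] * 719 + ["E"] * 481
table = frequency_table(responses)
for cat, (f, pct) in table.items():
    print(cat, f, pct)
```

Rounding to one decimal place reproduces the 59.9 and 40.1 shown in the table.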

What we have just accomplished is an example of a univariate frequency table pertaining to qualitative data. Let us now see how we can represent this information in the form of a diagram. One good way of representing the above information is in the form of a pie chart. A pie chart consists of a circle which is divided into two or more parts in accordance with the number of distinct categories that we have in our data. For the example that we have just considered, the circle is divided into two sectors, the larger sector pertaining to students coming from Urdu medium schools and the smaller sector pertaining to students coming from English medium schools. How do we decide where to cut the circle? The answer is very simple! All we have to do is to divide the cell frequency by the total frequency and multiply by 360. This process will give us the exact value of the angle at which we should cut the circle.
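The angle rule just described (cell frequency divided by total frequency, multiplied by 360) can be sketched as follows. The function name `pie_angles` is hypothetical; the frequencies are those of the example.

```python
def pie_angles(frequencies):
    """Map each category to its sector angle in degrees: 360 * f / total."""
    total = sum(frequencies.values())
    return {cat: 360 * f / total for cat, f in frequencies.items()}

angles = pie_angles({"Urdu": 719, "English": 481})
for cat, angle in angles.items():
    print(cat, round(angle, 1))
```

Because the frequencies add up to the total, the sector angles always add up to 360 degrees.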

PIE CHART

Medium of Institution    f       Angle
Urdu                     719     215.7°
English                  481     144.3°
Total                    1200

SIMPLE BAR CHART

The next diagram to be considered is the simple bar chart.

A simple bar chart consists of horizontal or vertical bars of equal width, with lengths proportional to the values they represent.

As the basis of comparison is one-dimensional, the widths of these bars have no mathematical significance but are chosen to make the chart look attractive. Let us consider an example.

Suppose we have available to us information regarding the turnover of a company for 5 years as given in the table below:

Year                 1965     1966     1967     1968     1969
Turnover (Rupees)    35,000   42,000   43,500   48,000   48,500

In order to represent the above information in the form of a bar chart, all we have to do is to take the year along the x-axis and construct a scale for turnover along the y-axis.

[Figure: y-axis scale for turnover, marked from 10,000 to 50,000 rupees]

Next, against each year, we will draw vertical bars of equal width and different heights in accordance with the turnover figures that we have in our table.

As a result we obtain a simple and attractive diagram as shown below. When our values do not relate to time, they should be arranged in ascending or descending order before charting.
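As a rough text-only stand-in for such a chart, the sketch below draws one row of `#` characters per year, with the row length proportional to the turnover. The function name `text_bar_chart`, the `unit` of 1,000 rupees per character, and the `sort_descending` switch are all assumptions made for illustration; lengths are rounded down to whole characters, so they are only approximately proportional.

```python
def text_bar_chart(values, unit=1000, sort_descending=False):
    """Return chart lines with one '#' per `unit` of value (rounded down)."""
    items = list(values.items())
    if sort_descending:
        # For values that do not relate to time, order them before charting.
        items.sort(key=lambda kv: kv[1], reverse=True)
    return [f"{label:>6} | {'#' * (v // unit)} {v}" for label, v in items]

turnover = {1965: 35000, 1966: 42000, 1967: 43500, 1968: 48000, 1969: 48500}
lines = text_bar_chart(turnover)
for line in lines:
    print(line)
```

The years are left in chronological order here because the values relate to time; for other categories, passing `sort_descending=True` applies the ordering rule stated above.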

[Figure: simple bar chart of turnover by year, y-axis marked from 10,000 to 50,000 rupees]

BIVARIATE FREQUENCY TABLE

What we have just considered was the univariate situation. In each of the two examples, we were dealing with one single variable. In the example of the first year students of a college, our lone variable of interest was ‘medium of schooling’. And in the second example, our one single variable of interest was turnover. Now let us expand the discussion a little, and consider the bivariate situation.


Going back to the example of the first year students, suppose that along with the enquiry about the medium of institution, you are also recording the sex of the student. Suppose that our survey results in the following information:

Student No.    Medium    Gender
1              U         F
2              U         M
3              E         M
4              U         F
5              E         M
6              E         F
7              U         M
8              E         M
:              :         :
:              :         :

Now this is a bivariate situation; we have two variables, medium of schooling and sex of the student. In order to summarize the above information, we will construct a table containing a box head and a stub as shown below:

Medium \ Sex    Male    Female    Total
Urdu
English
Total

The top row of this kind of table is known as the box head and the first column of the table is known as the stub. Next, we will count the number of students falling in each of the following four categories:

  1. Male student coming from an Urdu medium school.
  2. Female student coming from an Urdu medium school.
  3. Male student coming from an English medium school.
  4. Female student coming from an English medium school.

As a result, suppose we obtain the following figures:

Medium \ Sex    Male    Female    Total
Urdu            202     517       719
English         350     131       481
Total           552     648       1200

What we have just accomplished is an example of a bivariate frequency table pertaining to two qualitative variables.
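The counting step behind such a table can be sketched in Python. The function name `cross_tabulate` is hypothetical, and `sample` holds only the eight records listed earlier (students 1 to 8), so its counts are far smaller than the full survey figures.

```python
from collections import Counter

def cross_tabulate(records):
    """Count (medium, sex) pairs and add row, column and grand totals."""
    cells = Counter(records)
    rows = sorted({m for m, _ in cells})      # stub entries (medium)
    cols = sorted({s for _, s in cells})      # box head entries (sex)
    table = {m: {s: cells[(m, s)] for s in cols} for m in rows}
    for m in rows:
        table[m]["Total"] = sum(table[m][s] for s in cols)
    table["Total"] = {s: sum(table[m][s] for m in rows) for s in cols + ["Total"]}
    return table

# Students 1 to 8 from the listing: (medium, sex).
sample = [("U", "F"), ("U", "M"), ("E", "M"), ("U", "F"),
          ("E", "M"), ("E", "F"), ("U", "M"), ("E", "M")]
table = cross_tabulate(sample)
for medium, row in table.items():
    print(medium, row)
```

Each inner dictionary corresponds to one row of the table, with the last row and the `"Total"` key giving the marginal totals.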

COMPONENT BAR CHART

Let us now consider how we will depict the above information diagrammatically. This can be accomplished by constructing the component bar chart (also known as the subdivided bar chart) as shown below:

[Figure: component bar chart with one bar for male students and one bar for female students]

In the above figure, each bar has been divided into two parts. The first bar represents the total number of male students whereas the second bar represents the total number of female students.

As far as the medium of schooling is concerned, the lower part of each bar represents the students coming from English medium schools, whereas the upper part of each bar represents the students coming from Urdu medium schools. The advantage of this kind of diagram is that we are able to ascertain the situation of both the variables at a glance.

We can compare the number of male students in the college with the number of female students, and at the same time we can compare the number of English medium students among the males with the number of English medium students among the females.
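To make the subdivision concrete, the sketch below computes where each part of each bar starts and ends, stacking English medium at the bottom and Urdu medium on top as described. The function name `stacked_segments` is hypothetical; the counts are those of the bivariate frequency table.

```python
def stacked_segments(table, order):
    """For each bar, return (component, bottom, top) tuples stacked in `order`."""
    bars = {}
    for bar, components in table.items():
        bottom = 0
        segments = []
        for name in order:
            top = bottom + components[name]   # each part sits on the previous one
            segments.append((name, bottom, top))
            bottom = top
        bars[bar] = segments
    return bars

counts = {"Male": {"Urdu": 202, "English": 350},
          "Female": {"Urdu": 517, "English": 131}}
bars = stacked_segments(counts, order=["English", "Urdu"])
for bar, segments in bars.items():
    print(bar, segments)
```

The top of the last segment in each bar equals the total for that sex (552 males, 648 females), which is what makes the whole bars comparable as well as their parts.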

MULTIPLE BAR CHART

The next diagram to be considered is the multiple bar chart. Let us consider an example. Suppose we have information regarding the imports and exports of Pakistan for the years 1970-71 to 1974-75 as shown in the table below:

Year       Imports (Crores of Rs.)    Exports (Crores of Rs.)
1970-71    370                        200
1971-72    350                        337
1972-73    840                        855
1973-74    1438                       1016
1974-75    2092                       1029

Source: State Bank of Pakistan

A multiple bar chart is a very useful and effective way of presenting this kind of information.

This kind of a chart consists of a set of grouped bars, the lengths of which are proportionate to the values of our variables, and each of which is shaded or coloured differently in order to aid identification. With reference to the above example, we obtain the multiple bar chart shown below:

[Figure: Multiple Bar Chart Showing Imports & Exports of Pakistan, 1970-71 to 1974-75]

This is a very good device for the comparison of two different kinds of information.


If, in addition to information regarding imports and exports, we also had information regarding production, we could have compared them from year to year by grouping the three bars together. The question is, what is the basic difference between a component bar chart and a multiple bar chart? The component bar chart should be used when we have available to us information regarding totals and their components.

Consider, for example, the total number of male students, of whom some are Urdu medium and some are English medium. The number of Urdu medium male students and the number of English medium male students add up to give us the total number of male students. By contrast, in the example of exports and imports, the imports and exports do not add up to give us the totality of some one thing!
