Raw Data

Like the dealer's junk, unclassified or raw data are completely disorganised. They are often very large and cumbersome to handle. Drawing meaningful conclusions from them is difficult because they do not yield easily to statistical methods. Proper organisation and presentation of such data is therefore needed before any systematic statistical analysis is undertaken. Hence, after collecting data, the next step is to organise and present them in a classified form. Suppose you want to know the performance of students in mathematics and you have collected data on the marks in mathematics of 100 students of your school. If you present these data as a table, they may appear something like Table 3.1.

TABLE 3.1
Marks in Mathematics Obtained by 100 Students in an Examination

47 45 10 60 51 56 66 100 49 40
60 59 56 55 62 48 59 55 51 41
42 69 64 66 50 59 57 65 62 50
64 30 37 75 17 56 20 14 55 90
62 51 55 14 25 34 90 49 56 54
70 47 49 82 40 82 60 85 65 66
49 44 64 69 70 48 12 28 55 65
49 40 25 41 71 80 0 56 14 22
66 53 46 70 43 61 59 12 30 35
45 44 57 76 82 39 32 14 90 25

Or you could have collected data on the monthly expenditure on food of 50 households in your neighbourhood to know their average expenditure on food. The data collected, had you presented them as a table, would have resembled Table 3.2. Both Tables 3.1 and 3.2 contain raw or unclassified data. In both tables the numbers are not arranged in any order. Now suppose you are asked for the highest marks in mathematics in Table 3.1.

TABLE 3.2
Monthly Household Expenditure (in Rupees) on Food of 50 Households

1824 2559 3465 1465 2950
2401 1632 2453 1894 3438
4090 1809 2725 4446 1432
3211 1376 1118 2142 1147
1438 1616 1255 2638 4512
4228 1818 1452 1137 1711
1457 1230 1900 1657 3148
2025 1583 1324 2621 3326
1346 1942 1662 2173 2755
1272 2465 3446 2222 1456

Then you first have to arrange the marks of the 100 students in either ascending or descending order. That is a tedious task. It becomes more tedious if, instead of 100, you have the marks of 1,000 students to handle. Similarly, in Table 3.2 you would find it difficult to ascertain the average monthly expenditure of the 50 households. This difficulty goes up manifold if the number is larger, say 5,000 households. Like our junk dealer, who would be distressed to find a particular item when his junk becomes large and is not arranged properly, you would face a similar situation when you want to get any information from raw data that are large. In short, it is very hard to pull information out of large unclassified data.

The raw data are summarised and made comprehensible by classification. When facts of similar characteristics are placed in the same class, one can locate them easily, make comparisons, and draw inferences without difficulty. You have studied in Chapter 2 that the Government of India conducts a Census of population every ten years. The raw data of the Census are so large and fragmented that drawing any meaningful conclusion from them appears an almost impossible task. But when the Census data are classified according to gender, education, marital status, occupation, etc., the structure and nature of the population of India are easily understood.

The raw data consist of observations on variables. Each unit of raw data is an observation.
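The task described above, arranging the marks in Table 3.1 in order to find the highest marks and then placing similar values in the same class, can be sketched in a few lines of Python. This is only an illustration: the variable names and the choice of classes of width 10 (0-10, 10-20, ..., 90-100, with a mark of exactly 100 placed in the last class) are assumptions made for the example, not something fixed by the text.

```python
# Marks of 100 students, copied row by row from Table 3.1.
marks = [
    47, 45, 10, 60, 51, 56, 66, 100, 49, 40,
    60, 59, 56, 55, 62, 48, 59, 55, 51, 41,
    42, 69, 64, 66, 50, 59, 57, 65, 62, 50,
    64, 30, 37, 75, 17, 56, 20, 14, 55, 90,
    62, 51, 55, 14, 25, 34, 90, 49, 56, 54,
    70, 47, 49, 82, 40, 82, 60, 85, 65, 66,
    49, 44, 64, 69, 70, 48, 12, 28, 55, 65,
    49, 40, 25, 41, 71, 80, 0, 56, 14, 22,
    66, 53, 46, 70, 43, 61, 59, 12, 30, 35,
    45, 44, 57, 76, 82, 39, 32, 14, 90, 25,
]

# Arranging the raw data in ascending order puts the lowest marks
# first and the highest marks last.
ordered = sorted(marks)
lowest = ordered[0]
highest = ordered[-1]

# Classification: count how many marks fall in each class.
# WIDTH and the class limits are assumptions made for this sketch.
WIDTH = 10
frequency = {}
for m in marks:
    # A mark of exactly 100 is placed in the last class, 90-100.
    lower = min((m // WIDTH) * WIDTH, 90)
    label = f"{lower}-{lower + WIDTH}"
    frequency[label] = frequency.get(label, 0) + 1

print("Lowest marks:", lowest)
print("Highest marks:", highest)
for lower in range(0, 100, WIDTH):
    label = f"{lower}-{lower + WIDTH}"
    print(label, frequency.get(label, 0))
```

Once the marks are grouped in this way, a question such as "how many students scored between 40 and 50?" can be answered by reading one line of the output instead of scanning all 100 values.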

In Table 3.1 an observation shows a particular value of the variable “marks of a student in mathematics”. The raw data contain 100 observations on “marks of a student in mathematics” since there are 100 students. In Table 3.2 an observation shows a particular value of the variable “monthly expenditure of a household on food”. The raw data in it contain 50 observations on “monthly expenditure of a household on food” because there are 50 households.
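The idea of observations on a variable can be made concrete with the data of Table 3.2. In the sketch below, each element of the list is one observation on the variable "monthly expenditure of a household on food", and the average that is hard to ascertain by eye is computed directly. The variable names are assumptions chosen for the example.

```python
from statistics import mean

# Monthly expenditure on food (in Rupees) of 50 households,
# copied row by row from Table 3.2. Each number is one observation.
expenditure = [
    1824, 2559, 3465, 1465, 2950,
    2401, 1632, 2453, 1894, 3438,
    4090, 1809, 2725, 4446, 1432,
    3211, 1376, 1118, 2142, 1147,
    1438, 1616, 1255, 2638, 4512,
    4228, 1818, 1452, 1137, 1711,
    1457, 1230, 1900, 1657, 3148,
    2025, 1583, 1324, 2621, 3326,
    1346, 1942, 1662, 2173, 2755,
    1272, 2465, 3446, 2222, 1456,
]

# The number of observations equals the number of households.
n_observations = len(expenditure)

# Average (arithmetic mean) monthly expenditure on food.
average = mean(expenditure)

print("Number of observations:", n_observations)
print("Lowest expenditure:", min(expenditure))
print("Highest expenditure:", max(expenditure))
print("Average expenditure:", round(average, 2))
```

The same few lines would work unchanged for 5,000 households, which is the point of the passage: the effort lies in organising the data, not in the arithmetic.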



