# Summary

- Statistics, when used in the singular sense, may be defined as the science of collection, presentation, analysis and interpretation of numerical data
- In the plural sense, statistics means data relating to any phenomenon that are numerically expressed, enumerated or estimated according to a reasonable standard of accuracy, collected in a systematic manner for a pre-determined purpose and placed in relation to each other
- Data is the quantitative or qualitative information about some particular characteristic(s) under consideration
- A quantitative characteristic is known as a variable
- A qualitative characteristic is known as an attribute
- If the variable assumes a finite or countably infinite number of isolated values, it is known as a discrete variable
- If the variable can assume any value in a given interval, it is known as a continuous variable
- Collection of data is the method involved in collecting data through censuses and surveys or in a routine manner or through other sources
- The data which is collected for the first time is known as primary data
- If already collected data is used for the statistical analysis, it is known as secondary data
- Methods of collecting primary data
  - Interview method
    - Personal interview method
    - Indirect interview method
  - Mailed questionnaire method
  - Direct observation method
  - Telephonic method
- Sources of secondary data
  - Information collected through newspapers and periodicals
  - International sources like WHO, IMF, World Bank
  - Government sources like CSO, ICAI, etc.

- Checking the collected data for completeness, accuracy and reliability is known as scrutiny of data
- Where related series are available, scrutiny includes checking the data against those series for consistency
- Data can be classified in the following ways:
  - Chronological (temporal) classification
  - Qualitative (ordinal) classification
  - Geographical (spatial) classification
  - Quantitative (cardinal) classification

- Modes of presentation of data
  - Textual presentation
  - Tabular presentation or tabulation
  - Diagrammatic representation of data
  - Graphical representation of data

- Textual presentation is not preferred by statisticians, as it is dull and monotonous and does not allow comparison of data
- Charts, diagrams and pictures provide an attractive representation of statistical data
- Any hidden trend in the data can be easily noticed in a diagrammatic representation
- When a time series exhibits a wide range of fluctuations, a logarithmic or ratio chart is used for data analysis
- Multiple line charts are used for representing two or more related series expressed in the same unit
- Multiple axis charts are used for representing two or more related time series expressed in different units
- Horizontal bar diagram: to represent qualitative data or data varying over space
- Vertical bar diagram: to represent quantitative data or time series data
- A systematic arrangement which shows how the total frequency is distributed among the different values of variables is called frequency distribution.
- Types of frequency distribution
  - Discrete frequency distribution: a frequency distribution of a discrete variable
  - Continuous frequency distribution: a frequency distribution of a continuous variable
- The two end values of class intervals are called class limits
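- Class boundaries may be defined as the actual class limits of a class interval. For overlapping or mutually exclusive classification, which is usually applicable for continuous variables, the class boundaries coincide with the class limits
- For non-overlapping or mutually inclusive classification, which is usually applicable for a discrete variable, we have L.C.B = L.C.L - D/2 and U.C.B = U.C.L + D/2, where L.C.L and U.C.L are the lower and upper class limits, L.C.B and U.C.B are the lower and upper class boundaries, and D is the difference between the L.C.L of the next class interval and the U.C.L of the given class interval
- The difference between two successive mid points, or the difference between the upper and lower class boundaries of a class, is called class width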
- Range of data is difference between the highest and lowest value of the data
- Class limit: the minimum value and maximum value that the class interval may contain
  - For continuous type data: Class limit = Class boundary
  - For discontinuous type data: Class limit ≠ Class boundary
- Frequency distribution of a single variable is called uni-variate frequency distribution. Frequency distribution of more than one variable is called multi-variate frequency distribution
- Relative frequency may be defined as the ratio of the class frequency to the total frequency
- Frequency density may be defined as the ratio of the frequency of that class interval to the corresponding class length
- Graphical representation of a frequency distribution
  - Histogram or area diagram
  - Frequency polygon
  - Ogives or cumulative frequency graphs
  - Frequency curve
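
The conversion from class limits to class boundaries for an inclusive (non-overlapping) classification can be illustrated with a short Python sketch. It assumes the usual rule L.C.B = L.C.L - D/2 and U.C.B = U.C.L + D/2, with the gap D taken to be the same between every pair of adjacent classes. The function name `class_boundaries` and the sample limits are made up for illustration.

```python
def class_boundaries(limits):
    """Convert class limits to class boundaries.

    `limits` is a list of (LCL, UCL) pairs in ascending order.
    D is the gap between the UCL of one class and the LCL of the next;
    this sketch assumes the gap is the same for all adjacent classes.
    """
    if len(limits) < 2:
        raise ValueError("need at least two classes to measure the gap D")
    d = limits[1][0] - limits[0][1]
    return [(lcl - d / 2, ucl + d / 2) for lcl, ucl in limits]


# Hypothetical inclusive classes 10-19, 20-29, 30-39 (so D = 1)
limits = [(10, 19), (20, 29), (30, 39)]
boundaries = class_boundaries(limits)
print(boundaries)  # [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5)]
```

For an exclusive classification such as 0-10, 10-20 the gap D is zero, so the same function returns boundaries equal to the limits.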

- Mode can be determined using a histogram
- If the class intervals are of unequal width, frequency density is used to draw the histogram
- Frequency curve helps us in understanding the symmetry of the distribution
- Ogives are cumulative frequency curves
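
As a rough illustration of the last few points, the sketch below computes the bar heights of a histogram with unequal class widths and the points of a less-than ogive. The grouped data are invented for the example, and the variable names are not from the source.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 5), (10, 30, 12), (30, 40, 8)]

# Histogram bar heights: frequency density = frequency / class width
densities = [f / (ub - lb) for lb, ub, f in classes]

# Modal class read off the histogram: the class with the tallest bar
tallest = max(range(len(classes)), key=lambda i: densities[i])
modal_class = classes[tallest][:2]

# Less-than ogive: cumulative frequency plotted against upper boundaries,
# starting from zero at the lowest class boundary
cumulative = list(accumulate(f for _, _, f in classes))
ogive_points = [(classes[0][0], 0)] + [
    (ub, cf) for (_, ub, _), cf in zip(classes, cumulative)
]

print(modal_class)   # (30, 40)
print(ogive_points)  # [(0, 0), (10, 5), (30, 17), (40, 25)]
```

Here the class 10-30 has the largest frequency, but its bar is not the tallest once the wider interval is accounted for, which is why frequency density rather than raw frequency is plotted when widths differ.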

# Important Formulae:

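- Midpoint/Mid value/Class mark = (L.C.L + U.C.L) / 2 = (L.C.B + U.C.B) / 2
- Width/Size of class = U.C.B - L.C.B
- Frequency density = Class frequency / Class width
- Relative frequency = Class frequency / Total frequency
- % frequency = (Class frequency / Total frequency) × 100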
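
The quantities in this section can be checked on a small frequency table. The following Python sketch is only an illustration: the table values are hypothetical, and each row is assumed to be given by its lower class boundary, upper class boundary and class frequency.

```python
# Hypothetical frequency table: (L.C.B, U.C.B, class frequency)
table = [(9.5, 19.5, 4), (19.5, 29.5, 10), (29.5, 39.5, 6)]
total = sum(f for _, _, f in table)  # total frequency

rows = []
for lcb, ucb, f in table:
    width = ucb - lcb  # width of class = U.C.B - L.C.B
    rows.append({
        "class_mark": (lcb + ucb) / 2,         # mid value of the class
        "width": width,
        "frequency_density": f / width,        # frequency per unit width
        "relative_frequency": f / total,       # share of total frequency
        "percent_frequency": 100 * f / total,  # relative frequency as a %
    })

for row in rows:
    print(row)
```

The relative frequencies of all classes add up to 1, and the percentage frequencies add up to 100, which gives a quick check on a hand-built table.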