# Scatter Diagram

A scatter diagram, also known as a scatter plot, is a graphical representation of bivariate data. The pairs of values of two variables are plotted as points so that the pattern of their relationship can be seen visually.

Here, bivariate data with *n* pairs of values are represented by *n* points on the *xy* plane.

The pattern of distribution of points on the graph reveals the nature of correlation.

If the plotted points run in a band from the lower left corner to the upper right corner, then the correlation is positive (see Fig. 1(a)). In this case the value of the correlation coefficient *r* lies between 0 and 1, *i.e.*, 0 < *r* < 1.

If the plotted points lie on a straight line from the lower left corner to the upper right corner, then the correlation is perfect and positive (see Fig. 1(b)). In this case the value of *r* is unity, *i.e.*, *r* = 1.

If the plotted points run in a band from the upper left corner to the lower right corner, then the correlation is negative (see Fig. 2(a)). In this case the value of *r* lies between -1 and 0, *i.e.*, -1 < *r* < 0.

If the plotted points lie on a straight line from the upper left corner to the lower right corner, then the correlation is perfect and negative (see Fig. 2(b)). In this case the value of *r* is minus unity, *i.e.*, *r* = -1.

If the points are scattered evenly without showing any pattern, then there is zero correlation (see Fig. 3). In this case *r* = 0.
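The link between these patterns and the value of *r* can be checked numerically. The following is a minimal sketch in Python, assuming that *r* denotes Karl Pearson's correlation coefficient. The function names `pearson_r` and `describe`, the tolerance `tol`, and the sample data are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired observations (assumed meaning of r)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one series is constant")
    return sxy / sqrt(sxx * syy)


def describe(r, tol=1e-9):
    """Map a value of r to the categories used for scatter diagrams."""
    if abs(r - 1) <= tol:
        return "perfect positive"
    if abs(r + 1) <= tol:
        return "perfect negative"
    if abs(r) <= tol:
        return "zero"
    return "positive" if r > 0 else "negative"


xs = [1, 2, 3, 4, 5]
print(describe(pearson_r(xs, [2, 1, 4, 3, 5])))    # positive
print(describe(pearson_r(xs, [2, 4, 6, 8, 10])))   # perfect positive
print(describe(pearson_r(xs, [5, 3, 4, 1, 2])))    # negative
print(describe(pearson_r(xs, [10, 8, 6, 4, 2])))   # perfect negative
print(describe(pearson_r(xs, [3, 1, 5, 1, 3])))    # zero
```

A scatter diagram shows the same information only visually, so a computation like this is what supplies the exact magnitude that the diagram cannot.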

*Merits of scatter diagram*

- It is a very simple method of studying the correlation between two variables.
- The pattern of distribution of points on the graph gives a rough estimate of the degree of correlation between the variables.

*Demerits of scatter diagram*

- By this method, we do not get the exact magnitude of correlation between the variables. It is only a very rough measure.

# Coefficient of Concurrent Deviation

This method follows the principle that if the short-term fluctuations of two series are correlated, then their deviations are concurrent and their curves move in the same direction. Each *x* value (except the first) is assigned a positive sign if it is greater than the previous value and a negative sign if it is less than the previous value. The same is done for the *y*-series. The deviation in an *x*-value and the deviation in the corresponding *y*-value are said to be concurrent if both deviations have the same sign.

The coefficient of concurrent deviation is given by

$$R = \pm\sqrt{\pm\frac{2c - m}{m}}$$

where

*c* = number of concurrent deviations, *i.e.*, pairs in which both deviations have the same sign

*m* = *n* - 1, the number of pairs of deviations (*n* is the total number of data pairs)

*Very important points*

- If (2*c* - *m*) is positive, *R* takes the positive sign.
- If (2*c* - *m*) is negative, *R* takes the negative sign.
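The procedure can be sketched in Python as follows. This is an illustrative sketch, not a definitive implementation: the function names `deviation_signs` and `concurrent_deviation` and the sample series are hypothetical. The text does not say how to treat a value equal to the previous one, so the sketch assumes that such a tie gets a sign of 0 and is never counted as concurrent.

```python
from math import sqrt


def deviation_signs(series):
    """Sign of each value's change from the previous value: +1, -1, or 0 for a tie."""
    return [(curr > prev) - (curr < prev) for prev, curr in zip(series, series[1:])]


def concurrent_deviation(xs, ys):
    """Coefficient of concurrent deviation R for two paired series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    dx = deviation_signs(xs)
    dy = deviation_signs(ys)
    m = len(dx)  # m = n - 1 pairs of deviations
    # c counts pairs whose deviations share the same sign.
    # Assumption: pairs involving a tie (sign 0) are not counted as concurrent.
    c = sum(1 for a, b in zip(dx, dy) if a == b and a != 0)
    ratio = (2 * c - m) / m
    # The sign of (2c - m) is applied both inside and outside the square root.
    sign = 1 if ratio >= 0 else -1
    return sign * sqrt(abs(ratio))


xs = [10, 12, 11, 15, 18, 17, 20]
ys = [5, 7, 6, 9, 8, 7, 10]
# Signs of x: + - + + - +   Signs of y: + - + - - +
# c = 5 and m = 6, so R = +sqrt((10 - 6) / 6), roughly 0.816
print(round(concurrent_deviation(xs, ys), 3))  # 0.816
```

Only the direction of each change enters the calculation, which is why the method is quick but ignores the magnitude of the values.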

*Merits*

- This method gives a quick idea of the degree of relationship between the variables, even when the number of observations is large.
- It is easy to calculate compared with Karl Pearson's method.

*Demerits*

- The magnitude of the values is not taken into consideration; only the direction of change is used.
- The result indicates only approximately whether there is any correlation.