Regression Analysis

Regression analysis is the statistical technique that expresses the functional relationship between two or more variables as an equation, so that the value of one variable can be estimated from a given value of another.


The variable whose value is to be estimated is called the dependent variable, and the variable whose value is used to make the estimate is called the independent variable.


The linear algebraic equations that express a dependent variable in terms of an independent variable are called linear regression equations.


For example, if the sales and advertising expenses for a product are correlated, regression analysis can be used to estimate the dependent variable (sales in this case) for a given value of the independent variable (advertising expenses in this case).


For bivariate data (x, y), the regression equation obtained on the assumption that y depends on x is called the regression of y on x, and it is given by


y = a + bx


For the same bivariate data, the regression equation obtained on the assumption that x depends on y is called the regression of x on y, and it is given by


x = a + by


The above regression equations have the same form as the equation of a straight line.


With this similarity in mind, ‘a’ is the intercept and ‘b’ is the slope of the line represented by each equation; the values of ‘a’ and ‘b’ generally differ between the two equations. They can be found by solving the following pair of equations simultaneously.
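
For the regression of y on x (y = a + bx):

$$\sum y = na + b\sum x$$

$$\sum xy = a\sum x + b\sum x^2$$

For the regression of x on y (x = a + by), the roles of x and y are interchanged:

$$\sum x = na + b\sum y$$

$$\sum xy = a\sum y + b\sum y^2$$

where

n is the number of (x, y) pairs

$\sum x$ is the sum of all x values

$\sum y$ is the sum of all y values

$\sum xy$ is the sum of the products of each x value and the corresponding y value

$\sum y^2$ is the sum of squares of y values

$\sum x^2$ is the sum of squares of x values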
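
As a minimal sketch of how the pair of equations for y on x, Σy = na + bΣx and Σxy = aΣx + bΣx², can be solved for a and b, the snippet below uses Cramer's rule on the two equations. The function name `solve_normal_equations` and the four data points are illustrative assumptions, not part of the original text.

```python
def solve_normal_equations(xs, ys):
    """Return (a, b) for the line y = a + b*x from the two normal equations."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Normal equations:
    #   sum_y  = n*a     + b*sum_x
    #   sum_xy = a*sum_x + b*sum_x2
    # Solved here with Cramer's rule.
    det = n * sum_x2 - sum_x ** 2
    if det == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b


# Illustrative data (an assumption, not from the text) lying exactly on y = 1 + 2x.
a, b = solve_normal_equations([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```

Swapping the two argument lists gives the constants for the regression of x on y instead.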


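The regression equation of x on y can also be expressed as follows:

$$x - \bar{x} = b_{xy}\,(y - \bar{y})$$

and that of y on x can be expressed as

$$y - \bar{y} = b_{yx}\,(x - \bar{x})$$

The constants $b_{xy}$ and $b_{yx}$ are called the regression coefficients, and $\bar{x}$ and $\bar{y}$ are the arithmetic means of the x and y values respectively.

$b_{xy}$ and $b_{yx}$ can be calculated using the expressions given below:

$$b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - \left(\sum y\right)^2} = r\,\frac{\sigma_x}{\sigma_y}$$

$$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = r\,\frac{\sigma_y}{\sigma_x}$$

where r is the coefficient of correlation and $\sigma_x$ and $\sigma_y$ are the standard deviations of x and y.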
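
The regression coefficients are commonly computed in two equivalent ways: from raw sums, b_yx = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²), or from the correlation coefficient and standard deviations, b_yx = r·σy/σx (with x and y interchanged for b_xy). The sketch below computes both forms so they can be compared. The function names and the five data points are assumptions made for illustration.

```python
import math


def coefficients_from_sums(xs, ys):
    """Return (b_yx, b_xy) using the raw-sum formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    b_yx = numerator / (n * sum_x2 - sum_x ** 2)
    b_xy = numerator / (n * sum_y2 - sum_y ** 2)
    return b_yx, b_xy


def coefficients_from_r(xs, ys):
    """Return (b_yx, b_xy) as r*sd_y/sd_x and r*sd_x/sd_y."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    r = covariance / (sd_x * sd_y)
    return r * sd_y / sd_x, r * sd_x / sd_y


# Illustrative data (an assumption, not from the text).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
by_sums = coefficients_from_sums(xs, ys)
by_r = coefficients_from_r(xs, ys)
print(by_sums, by_r)
```

The two results should agree up to floating-point rounding.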


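Find the regression equation for the following data and estimate Y when X = 13.
X    2    4    5    5    8    10
Y    6    7    9    10   12   12
Here n = 6, $\sum x = 34$, $\sum y = 56$, $\sum xy = 351$ and $\sum x^2 = 234$, so $\bar{x} = 34/6 = 5.6667$ and $\bar{y} = 56/6 = 9.3333$.
Since regression coefficients are independent of the origin, they can be computed either from the raw values or from deviations about any convenient origin. Using the raw sums, the required regression coefficient is as follows:
$$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = \frac{6(351) - (34)(56)}{6(234) - (34)^2} = \frac{202}{248} = 0.8145$$
Thus, the regression of y on x is
$$y - \bar{y} = b_{yx}\,(x - \bar{x})$$
On substitution, we get
$$y - 9.3333 = 0.8145\,(x - 5.6667)$$
$$y = 0.8145x + 4.7178$$
This is the regression equation of y on x.
To estimate the value of y when X = 13, the value X = 13 is substituted in the above equation.
Thus, the estimate of Y is
Y = 0.8145 × 13 + 4.7178 = 15.3063.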
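
The arithmetic of this example can be checked with a short script. This is only a sketch: the variable names are assumptions, and the raw-sum formula b_yx = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) is used for the slope. Because the code keeps full precision instead of rounding the slope to four decimals first, its last digits may differ slightly from the hand calculation.

```python
# Data from the worked example.
xs = [2, 4, 5, 5, 8, 10]
ys = [6, 7, 9, 10, 12, 12]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Slope of the regression of y on x from the raw sums.
b_yx = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# The line passes through the means, so the intercept is mean_y - b_yx * mean_x.
intercept = sum_y / n - b_yx * (sum_x / n)

# Estimate of Y at X = 13.
estimate = intercept + b_yx * 13
print(round(b_yx, 4), round(intercept, 4), round(estimate, 4))
```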


A survey of children revealed the following information regarding the IQ of the child (x) and the age of the mother, in years, at the time of the child's birth (y).
                      x     y
Mean                  98    28 years
Standard deviation    2     4 years
Coefficient of correlation $r_{xy} = -0.24$
Estimate the IQ of a child whose mother was aged 47 years at the time of giving birth to the child.
Let us assume that IQ depends on the age of the mother. To estimate x for a given y, the regression equation of x on y is used:
$$x - \bar{x} = b_{xy}\,(y - \bar{y}), \qquad b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = -0.24 \times \frac{2}{4} = -0.12$$
$$x - 98 = -0.12\,(y - 28)$$
Hence, the regression equation of x on y is
$$x = -0.12y + 101.36$$
Estimate of the IQ of a child when y = 47 is
x = -0.12(47) + 101.36 = 95.72.
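
When only summary statistics are available, as in this example, the regression of x on y can be built directly from the means, the standard deviations and r, using b_xy = r·σx/σy. The snippet below is a sketch of that calculation; the function name `estimate_iq` is an assumption introduced for illustration.

```python
# Summary statistics from the example.
mean_x, mean_y = 98.0, 28.0   # mean IQ, mean age of mother (years)
sd_x, sd_y = 2.0, 4.0         # standard deviations of x and y
r = -0.24                     # coefficient of correlation

# Regression coefficient of x on y.
b_xy = r * sd_x / sd_y


def estimate_iq(mother_age):
    """Estimate x from y with x - mean_x = b_xy * (y - mean_y)."""
    return mean_x + b_xy * (mother_age - mean_y)


print(round(b_xy, 2), round(estimate_iq(47), 2))
```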


Note: The regression equation of x on y is used for the estimation of x values and the regression equation of y on x is used for the estimation of y values.

Properties of Regression Coefficient

  • The product of the two regression coefficients gives the square of the coefficient of correlation: $r^2 = b_{yx} \times b_{xy}$, where r is the coefficient of correlation and $b_{yx}$ and $b_{xy}$ are the regression coefficients
  • Since the coefficient of correlation cannot numerically exceed 1, the product of the regression coefficients cannot be greater than 1
  • Both regression coefficients have the same sign as r
  • The arithmetic mean of the two regression coefficients is numerically greater than or equal to the coefficient of correlation
  • Regression coefficients are independent of change of origin, but not of scale
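
These properties can be checked numerically. The sketch below reuses the X and Y values from the first worked example; the helper `regression_coefficients` and the particular shift and scale values are assumptions chosen for illustration.

```python
import math


def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy, r) computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / syy, sxy / math.sqrt(sxx * syy)


xs = [2, 4, 5, 5, 8, 10]
ys = [6, 7, 9, 10, 12, 12]
b_yx, b_xy, r = regression_coefficients(xs, ys)

# Product of the coefficients equals r squared, and so cannot exceed 1.
product_is_r_squared = math.isclose(b_yx * b_xy, r ** 2)

# Both coefficients carry the sign of r.
same_sign = (b_yx > 0) == (r > 0) and (b_xy > 0) == (r > 0)

# Arithmetic mean of the coefficients is numerically at least |r|.
mean_at_least_r = (abs(b_yx) + abs(b_xy)) / 2 >= abs(r)

# Change of origin (subtract 5 from x, 9 from y) leaves the coefficients unchanged.
shifted = regression_coefficients([x - 5 for x in xs], [y - 9 for y in ys])

# Change of scale (multiply x by 10) divides b_yx by 10 and multiplies b_xy by 10.
scaled = regression_coefficients([10 * x for x in xs], ys)

print(product_is_r_squared, same_sign, mean_at_least_r)
```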

Properties of Regression Lines

  • The two lines of regression intersect at the point given by the average values of x and y, $(\bar{x}, \bar{y})$
  • If there is perfect correlation (r = ±1), the regression lines coincide
  • The acute angle A between the regression lines is given by $\tan A = \frac{1 - r^2}{|r|} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$
  • The regression lines are perpendicular to each other when r = 0
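
The first and third properties can be illustrated numerically. The sketch below fits both regression lines to the X and Y values from the first worked example, solves the two line equations for their intersection, and compares the angle obtained from the slopes with the standard closed form tan A = ((1 − r²)/|r|) · σxσy/(σx² + σy²). The helper `fit_line` and all variable names are assumptions made for this illustration.

```python
import math


def fit_line(us, vs):
    """Least-squares (a, b) for v = a + b*u, from the normal equations."""
    n = len(us)
    su, sv = sum(us), sum(vs)
    suv = sum(u * v for u, v in zip(us, vs))
    suu = sum(u * u for u in us)
    det = n * suu - su ** 2
    return (sv * suu - su * suv) / det, (n * suv - su * sv) / det


xs = [2, 4, 5, 5, 8, 10]
ys = [6, 7, 9, 10, 12, 12]
n = len(xs)

a1, b_yx = fit_line(xs, ys)   # regression of y on x: y = a1 + b_yx * x
a2, b_xy = fit_line(ys, xs)   # regression of x on y: x = a2 + b_xy * y

# Intersection of the two lines, found by substituting one into the other.
x_star = (a2 + b_xy * a1) / (1 - b_yx * b_xy)
y_star = a1 + b_yx * x_star
mean_x, mean_y = sum(xs) / n, sum(ys) / n

# Slopes of both lines in the (x, y) plane: b_yx and 1 / b_xy.
m1, m2 = b_yx, 1 / b_xy
tan_from_slopes = abs(m2 - m1) / (1 + m1 * m2)

# The same angle from the closed-form expression in r and the standard deviations.
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
tan_from_formula = (1 - r ** 2) / abs(r) * sd_x * sd_y / (sd_x ** 2 + sd_y ** 2)

print((x_star, y_star), (mean_x, mean_y))
print(math.degrees(math.atan(tan_from_slopes)), math.degrees(math.atan(tan_from_formula)))
```

The intersection should match the pair of means, and the two angles should agree up to rounding. The sketch assumes r is neither 0 nor ±1, since the substitution step divides by 1 − b_yx·b_xy and the slope 1/b_xy requires b_xy ≠ 0.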
