# Multiple Regression | What is it and How is it Useful in Statistics?

Multiple regression is one of the major techniques in statistics, and it plays a crucial role in both statistics and applied mathematics. This article explains what it is, when it is useful, and how to carry it out. Let's discuss it.


**What is regression?**

Before understanding what **multiple regression** is all about, it makes sense to know a bit about regression. Regression is a statistical method used to determine the relationship between one dependent variable and one or more independent variables. The dependent variable is usually denoted by “Y”. Regression finds wide application in finance and economics.

**Types of regression**

It is mainly categorized into –

- Simple regression
- **Multiple regression**

Simple regression uses only a single independent variable to predict the outcome of the dependent variable “Y”, whereas multiple regression uses two or more independent variables to predict the outcome of “Y”.

Regression, be it simple or multiple, takes the form of an equation.

The general form of each type of regression is:

- Simple linear regression: Y = a + bX + u
- Multiple linear regression: Y = a + b1X1 + b2X2 + b3X3 + … + btXt + u

Where,

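Y is the dependent variable, the one which is predicted.

X is the independent variable that is used to predict Y (X1, X2, …, Xt in the multiple case).

a is the intercept.

b is the slope (b1, b2, …, bt are the slope coefficients of X1, X2, …, Xt in the multiple case), and

u is the regression residual, the part of Y that the equation does not explain.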
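To make the equation concrete, here is a minimal sketch of estimating a and b1, b2, … by ordinary least squares with NumPy. The function name `fit_linear_regression` and the data are made up for illustration; the data is generated exactly from Y = 1 + 2·X1 + 3·X2, so the residual u should come out as (numerically) zero.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Least-squares estimate of a and b1..bt in Y = a + b1*X1 + ... + bt*Xt + u.

    Hypothetical helper for illustration; returns (a, b, u).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        # Simple regression: a single independent variable.
        X = X.reshape(-1, 1)
    # A leading column of ones lets the intercept a be estimated with the slopes.
    design = np.column_stack([np.ones(len(X)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coefs
    return coefs[0], coefs[1:], residuals

# Made-up data built from Y = 1 + 2*X1 + 3*X2 with no noise.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

a, b, u = fit_linear_regression(X, y)
print("intercept a:", a)
print("slopes b:", b)
print("largest residual |u|:", np.max(np.abs(u)))
```

Passing a one-dimensional X to the same function gives the simple regression case, Y = a + bX + u.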

**What is multiple regression?**

Two terms come up often in this context: multiple linear regression (MLR) and **multiple regression**. Both terms mean the same thing. It is a statistical technique that uses two or more explanatory (independent) variables to predict the outcome of one response (dependent) variable.

**When can it be used?**

Having understood what exactly **multiple regression** is, it is now time to know when and where it can be used.

- When the user wants to determine the value of a dependent variable at particular values of independent variables. For example, how will the tax collected by the Government change if there’s a change in the income tax slabs, GST policies, etc.?
- When the strength of the relationship between the independent and dependent variables is to be established and measured. In simple words, it helps to determine the strength of the effect that the independent variables have on a dependent variable. For example, how will the rains affect the yield of the crops?
- When trends and patterns in the data are to be identified and used to predict future values.
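The first two uses above can be sketched in a few lines of NumPy: predicting the dependent variable at chosen values of the independent variables, and measuring the strength of the relationship with R² (the share of the variation in Y explained by the model). The rainfall, fertilizer, and yield figures below are invented purely for illustration, and `predict` and `r_squared` are hypothetical helper names.

```python
import numpy as np

# Hypothetical data: rainfall (mm) and fertilizer (kg) are the independent
# variables, crop yield (tonnes) is the dependent variable.
rain = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
fertilizer = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
crop_yield = np.array([2.1, 3.9, 4.8, 7.2, 7.9, 10.1])

# Fit Y = a + b1*rain + b2*fertilizer + u by least squares.
design = np.column_stack([np.ones(len(rain)), rain, fertilizer])
coefs, *_ = np.linalg.lstsq(design, crop_yield, rcond=None)

def predict(coefs, x_new):
    """Value of Y at particular values of the independent variables."""
    return coefs[0] + np.dot(coefs[1:], np.asarray(x_new, dtype=float))

def r_squared(y, y_hat):
    """R^2 = 1 - SS_residual / SS_total."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

r2 = r_squared(crop_yield, design @ coefs)
forecast = predict(coefs, [45.0, 4.0])  # 45 mm of rain, 4 kg of fertilizer
print("R^2:", r2)
print("predicted yield:", forecast)
```

An R² close to 1 suggests the independent variables explain most of the variation in Y on this sample; it says nothing on its own about cause and effect.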

**Assumptions**

Certain assumptions back **multiple regression**. There are four major assumptions involved in the process. They are –

**Linearity**

Multiple regression requires the relationship between the independent and dependent variables to be linear. There are many ways to check this assumption, but scatterplots and partial regression plots are the most common. If either technique shows a non-linear relationship, one can run a non-linear regression analysis or transform the data.

**Homogeneity of variance (homoscedasticity)**

This assumes that the variance of the errors is the same for all observations. In other words, the size of the error doesn’t change significantly across the values of the independent variables.
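One rough way to look at this numerically is to sort the observations by fitted value, split them in half, and compare the residual variance of the two halves. The sketch below is loosely in the spirit of the Goldfeld-Quandt test but is not a formal test; the function name `residual_variance_ratio` and the numbers are assumptions made for illustration.

```python
import numpy as np

def residual_variance_ratio(fitted, residuals):
    """Variance of residuals in the upper half of fitted values divided by
    the variance in the lower half. A ratio near 1 is consistent with
    constant variance; a ratio far from 1 hints at heteroscedasticity.
    """
    fitted = np.asarray(fitted, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    order = np.argsort(fitted)
    half = len(order) // 2
    low = residuals[order[:half]]    # residuals at the smallest fitted values
    high = residuals[order[-half:]]  # residuals at the largest fitted values
    return np.var(high, ddof=1) / np.var(low, ddof=1)

# Made-up fitted values and two made-up residual patterns.
fitted = np.arange(1.0, 9.0)
steady = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
growing = steady * fitted  # spread grows with the fitted value

ratio_steady = residual_variance_ratio(fitted, steady)
ratio_growing = residual_variance_ratio(fitted, growing)
print("steady residuals ratio:", ratio_steady)
print("growing residuals ratio:", ratio_growing)
```

In practice a plot of residuals against fitted values is usually examined alongside any such number.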

**Independence of observations**

This assumption is about the observations in the sample being independent of each other. Simply put, the measurements for one subject are not influenced by or related to the measurements of other subjects in the sample.

**The assumption of normality**

This states that the residuals (errors) of the model follow a normal distribution.

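In addition to the above, there are a few other assumptions, such as the absence of multicollinearity, meaning that the independent variables are not highly correlated with one another. A related check is that a plot of the residuals against the predicted values looks approximately rectangular, which indicates that the residuals are homoscedastic.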
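Multicollinearity is often checked with the variance inflation factor (VIF), a diagnostic not described above: each independent variable is regressed on the others, and VIF = 1 / (1 − R²) of that regression. The sketch below is a minimal NumPy version with a hypothetical function name and made-up data; a common rule of thumb treats values above roughly 5 to 10 as a warning sign, but that threshold is a convention, not a rule.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X (rows = observations, columns = variables).

    Column j is regressed on the remaining columns plus an intercept;
    VIF_j = 1 / (1 - R^2_j). Perfectly collinear columns give an infinite VIF.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
        ss_res = np.sum((target - design @ coefs) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        r2 = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Made-up predictors: the first pair is uncorrelated, the second nearly identical.
uncorrelated = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])
nearly_equal = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]])

vif_uncorrelated = variance_inflation_factors(uncorrelated)
vif_nearly_equal = variance_inflation_factors(nearly_equal)
print("uncorrelated predictors:", vif_uncorrelated)
print("nearly identical predictors:", vif_nearly_equal)
```

When the VIFs are large, dropping or combining the overlapping variables is a typical remedy.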

**Steps**

Now that everything is in place, let’s look at the steps to be followed while performing multiple regression analysis.

- The first step undoubtedly is to collect the data. This data collection is not limited to just one source. One can make use of multiple sources for data collection. After collecting the data, it has to be checked for typos, anomalies, etc.
- After collecting and preparing the data, the next step is to select the variables. Multiple regression involves two or more independent variables, so close attention has to be paid during selection. Ultimately, the outcome depends on the variables selected.
- Next comes the fun part: running the program. The process mainly involves entering the variables, choosing report options, entering alpha (constant or the intercept), and lastly, selecting the plots.
- After running the program, one gets the output. It is here that the conditions for linearity, normality, constant variance, multicollinearity, etc., can be checked.
- The last step would be to record the results.

**How to carry out multiple regression on SPSS?**

SPSS is a widely used statistical software package that supports many statistical analysis techniques, including regression analysis.

To carry out **multiple regression** on SPSS, the following steps have to be followed.

- On opening this software, one will be able to find the main menu. The first step would be to create the variable required for the analysis.
- After doing so, select the ‘Analyze’ option. Under this, select ‘Regression’ followed by ‘Linear’.
- This will open up a linear regression dialogue box. This is where the variables have to be assigned to the respective category, dependent and independent.
- After this, click on the ‘statistics’ option. A dialogue box will pop up. Here, one is required to enter the confidence level.
- Now that all the details are filled in, click on the ‘continue’ option. With this, the linear regression box is displayed again.
- Lastly, click on ‘OK’ to generate the output.

**Applications of multiple regression**

Some of the most common applications are –

- Predicting the company’s sales based on previous sales, government policies, GDP growth rate, inflation, etc.
- Helping finance managers arrive at the cost of capital.
- Pricing assets, and a lot more.

## Let’s Sum Up

**Multiple regression** serves us in a variety of ways, and because it can account for several independent variables at once, it has many more applications than simple regression. Using the right techniques and proper data collection methods can help predict outcomes that ultimately pave the way for better decisions ahead.