POLYNOMIAL REGRESSION
Linear
regression is the relation between the dependent variable and the independent
variable to be linear. If the distribution of the data becomes more complex
then the linear line won't best fit in the data. So for the line or a curve to best
fit the data we use polynomial regression.How can we generate a curve that best
captures the data?
The basic goal of
regression analysis is to model the expected value of a dependent variable y in
terms of the value of an independent variable x. In simple regression, we used
following equation
y=a + bx
Here y is dependent variable, a is
y intercept, b is the slope.
In many cases, this linear model
will not work out.The predicted results sometimes generate a value which is far from actual values which can lead to wrong information being conveyed.In such cases we use the following equation
y= a + b1x + b2x^2 +....+ bnx^n
y= a + b1x + b2x^2 +....+ bnx^n
Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an nth degree polynomial in x
.
Notice that Polynomial
Regression is very similar to Multiple LinearRegression, but consider at the
same time instead of the different variables the same one X1, but in different
powers; so basically we are using 1 variable to different powers of the same
original variable.
For example:Consider that there is a human resource team
working for a big company and about to hire a new employee in this company.So
the new employee seems to be a good fit for this job and now its time to
negotiate about his new salary for this
job.The employee is asking a salary to be above
160k annual salary since he has 20+ years of experience and his annual salary was
160k in his previous company. However there is someone in the hr team who wants
to know whether the details mentioned by the employee are true or not and that is why the HR retrieves the records of the previous company of different positions.With the help of applying
polynomial regression to the above dataset one can find out by building a bluffing detecter to predict whether the analysis
leads to truth or bluff i.e whether the new employee had mentioned the correct
salary or not.
To understand polynomial regression,lets generate dataset:
Fitting Linear regression to the dataset:
To implement
this in Python we are going to create a LR model first from sklearn.linearmodel
and call our object lin_reg
Fit polynomial regression to the dataset:
In this
object we are going to call poly_reg that will be our transforming tool which
will transform our matrix X in a new polynomial matrix where we can specify the
degree we want to run our model with:
Visualizing linear regression results:
We can see that the straight line is unable to
capture the patterns in the data.The line signifies the predicted values and
the data points are the actual values.The line doesn’t best fits the data and
there is huge difference between the actual salaries and predicted which can
give a wrong information to the HR.To overcome this we fit a polynomial
regression model.
Visualizing Polynomial regression results:
It is quite
clear from the plot that the quadratic curve is able to fit the data better
than the linear line.The curve best fits the data and the error between the predicted curve and actual curve has decreased which will help the HR to get correct information about the new employee salary which is approximately equal to 160k annual salary and hance the mentioned details by the employee is correct.
Thanks for Reading!
No comments:
Post a Comment