{"id":1656,"date":"2019-04-25T02:18:38","date_gmt":"2019-04-25T02:18:38","guid":{"rendered":"http:\/\/kusuaks7\/?p=1261"},"modified":"2023-07-28T04:20:53","modified_gmt":"2023-07-28T04:20:53","slug":"linear-regression","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/linear-regression\/","title":{"rendered":"Linear Regression"},"content":{"rendered":"<p><strong>Linear regression is one of the most popular and best understood algorithms in the machine learning landscape. Since regression tasks belong to the most common machine learning problems in supervised learning, every Machine Learning Engineer should have a\u00a0thorough understanding of how it works. <\/strong><\/p>\n<p><strong>This blog post covers how the linear regression algorithm works, where it is used, how you can evaluate its performance and which tools &amp; techniques should be used along with it.<\/strong><\/p>\n<p>&nbsp;<\/p>\n<p><strong>Table of Contents:<\/strong><\/p>\n<ul>\n<li><b>Example of a use case<\/b><\/li>\n<li><b>How it works<\/b><\/li>\n<li><b>Cost Function<\/b><\/li>\n<li><b>Gradient Descent<\/b><\/li>\n<li><b>Feature Normalization<\/b><\/li>\n<li><b>Polynomial Regression<\/b><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><b>Example of a Use Case:<\/b><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center;\"><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/1.png?w=601&amp;h=228\" sizes=\"(max-width: 601px) 100vw, 601px\" srcset=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/1.png?w=601&amp;h=228 601w, https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/1.png?w=1202&amp;h=456 1202w, https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/1.png?w=150&amp;h=57 150w, https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/1.png?w=300&amp;h=114 300w, https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/1.png?w=768&amp;h=291 768w, 
https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/1.png?w=1024&amp;h=388 1024w\" alt=\"1.png\" width=\"601\" height=\"228\" data-attachment-id=\"331\" data-comments-opened=\"1\" data-image-description=\"\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"1\" data-large-file=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/1.png?w=601&amp;h=228?w=736\" data-medium-file=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/1.png?w=601&amp;h=228?w=300\" data-orig-file=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/1.png?w=601&amp;h=228\" data-orig-size=\"1588,602\" data-permalink=\"https:\/\/machinelearning-blog.com\/2018\/01\/24\/linear-regression\/attachment\/1\/\" \/><\/p>\n<p>A good example of the use of linear regression is the prediction of housing prices. We&#8217;re going to use housing prices dataset from the city of Portland, Oregon.\u00a0 Some houses with different sizes and prices\u00a0are plotted in the image above.<\/p>\n<p>Imagine a friend of you want\u2019s to sell his house of the size of 1250 square feet and he wants to know from you, for how much he can probably sell it for. 
One thing you could do is fit a straight line to the data points in the graph, which is called regression.<\/p>\n<p>Based on this line you can see that he may be able to sell it for about $220,000.<\/p>\n<p>Predicting housing prices is a regression problem because the term regression refers to the fact that we are predicting real-valued outputs.<\/p>\n<p style=\"text-align: center;\"><img src=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/2.png?w=616&amp;h=239\" alt=\"2.png\" width=\"616\" height=\"239\" \/><\/p>\n<p>The other most common supervised learning problem is called classification, where we predict discrete-valued outputs, like telling whether a tumor is malignant (1) or benign (0). But we will discuss classification in another blog post.<\/p>\n<p>Usually, in supervised learning, we have a dataset, which is called the training set. If we go back to the housing prices example, we have a training set of housing prices that contains examples of different houses, details about the houses and their prices. Our job is to learn from this data how to predict housing prices.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p><b>How it works:<\/b><\/p>\n<p style=\"text-align: center;\"><img src=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/4.png?w=487&amp;h=375\" alt=\"4.png\" width=\"487\" height=\"375\" \/><\/p>\n<p>We feed the training set into our learning algorithm, which then outputs a function h, based on what it has learned from the training set. This function is called the &#8220;hypothesis&#8221;.<\/p>\n<p>The hypothesis is then able to estimate the price of a house, using its size as an input. This example is called &#8220;univariate&#8221; regression, which means that we predict a single output value from a single input value. There is also &#8220;multivariate&#8221; regression, where you have several input values.<\/p>\n<p>The next thing we need to decide is how to represent the hypothesis. 
There are different ways to represent it; I will use Andrew Ng&#8217;s notation (for univariate regression):<\/p>\n<p style=\"text-align: center;\"><img src=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/5.png?w=510&amp;h=114\" alt=\"5\" width=\"510\" height=\"114\" \/><\/p>\n<p>Note that this is just the equation of a straight line.<\/p>\n<p>&nbsp;<\/p>\n<p><b>Cost Function:<\/b><\/p>\n<p>After we&#8217;ve trained our learning algorithm and got a hypothesis, we need to examine how good our results are. This is done by the so-called cost function.<\/p>\n<p>The cost function measures the accuracy of the hypothesis outputs. It does this by comparing the predicted prices of the hypothesis with the actual prices of the houses.<\/p>\n<p>To explain it in more detail: the cost function takes an average (actually a slightly more complex version of an average) of the differences between the hypothesis outputs and the actual prices of the houses. We therefore want the output of the cost function to be as small as possible, because a smaller value means more accurate predictions. The most commonly used cost function in linear regression is the &#8220;Mean Squared Error (MSE)&#8221; cost function.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p><b>Gradient Descent:<\/b><\/p>\n<p>So we have a hypothesis and a technique to measure its accuracy, but how can we actually improve the outputs of our hypothesis? We need to optimize the parameters \u03b80 and \u03b81 of the hypothesis, which are fed into the cost function.<\/p>\n<p>This is where gradient descent comes into play.<\/p>\n<p><b>Gradient descent is an optimization algorithm that tweaks its parameters iteratively.<\/b><\/p>\n<p>But what does that actually mean?<\/p>\n<p>We already know that we want to find parameters that reduce the output of the cost function as much as possible. 
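<p>To make the two pieces above concrete, here is a minimal NumPy sketch of the straight-line hypothesis and the MSE cost function. The variable names, the toy numbers and the common 1\/(2m) scaling convention are illustrative assumptions, not code from this post:<\/p>

```python
import numpy as np

def hypothesis(theta0, theta1, x):
    # h(x) = theta0 + theta1 * x -- the straight-line hypothesis
    return theta0 + theta1 * x

def mse_cost(theta0, theta1, x, y):
    # Mean squared error over all m training examples; the 1/(2m)
    # factor is a common convention that simplifies the gradient later.
    m = len(y)
    errors = hypothesis(theta0, theta1, x) - y
    return np.sum(errors ** 2) / (2 * m)

sizes = np.array([1000.0, 1500.0, 2000.0])   # hypothetical house sizes (sq ft)
prices = np.array([200.0, 300.0, 400.0])     # hypothetical prices (in $1000s)
print(mse_cost(0.0, 0.2, sizes, prices))     # a perfect fit gives cost 0.0
```

<p>The smaller the returned value, the better the line with parameters \u03b80, \u03b81 fits the training set.<\/p>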
Here&#8217;s an illustration of gradient descent:<\/p>\n<p style=\"text-align: center;\"><img src=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/7.png?w=450&amp;h=298\" alt=\"7.png\" width=\"450\" height=\"298\" \/><\/p>\n<p>In this picture, the horizontal axes represent the space of parameters \u03b80 and \u03b81. The cost function J(\u03b80, \u03b81) is a surface above them, so the height of the surface represents the value of the cost function at a certain point.<\/p>\n<p>As you already know, we want to find the values of \u03b80 and \u03b81 that correspond to the minimum of the cost function (marked with the red arrow). This minimum is called the global optimum. You can also see that the cost function is convex, which is one of the main reasons why gradient descent works so well for linear regression.<\/p>\n<p>To find the right values, we initialize \u03b80 and \u03b81 with some random numbers. Gradient descent then starts at that initial point (somewhere around the top of our illustration) and takes one step after another in the direction of steepest descent (e.g. from the top to the bottom of the illustration), until it reaches the <b>global optimum<\/b> (where the cost function is as small as possible), which is marked with the red arrow.<\/p>\n<p>How big these steps are is determined by the so-called <b>Learning Rate<\/b>. 
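<p>The descent loop just described can be sketched in a few lines of Python. This is a minimal illustrative implementation (the toy data, zero initialization and default learning rate are my assumptions; the gradients are the standard ones for the MSE cost):<\/p>

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, iterations=1000):
    # Repeatedly update theta0 and theta1 simultaneously by stepping
    # against the gradient of the MSE cost, descending toward the
    # global optimum of the cost surface.
    m = len(y)
    theta0, theta1 = 0.0, 0.0               # initial point on the surface
    for _ in range(iterations):
        errors = (theta0 + theta1 * x) - y  # h(x) - y for every example
        grad0 = np.sum(errors) / m          # dJ/dtheta0
        grad1 = np.sum(errors * x) / m      # dJ/dtheta1
        theta0 -= alpha * grad0             # step size is controlled by
        theta1 -= alpha * grad1             # the learning rate alpha
    return theta0, theta1

# Toy data following y = 2x + 1, with features already in a small range
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 * x + 1
theta0, theta1 = gradient_descent(x, y)
print(theta0, theta1)   # converges close to 1.0 and 2.0
```

<p>Each pass through the loop is one &#8220;step downhill&#8221; in the illustration above.<\/p>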
It is important that the learning rate is neither too high nor too low: if it is too high, gradient descent takes steps that are too big and may overshoot and never reach the global optimum; if it is too low, it simply takes too much time to reach the global optimum.<\/p>\n<p style=\"text-align: center;\"><img src=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/8.png?w=478&amp;h=241\" alt=\"8.png\" width=\"478\" height=\"241\" \/><\/p>\n<p>There is a way to find out whether you have chosen the right learning rate: generate a plot of the number of iterations of gradient descent vs. the value of the cost function (MSE). If the cost function is increasing at some point, you need to decrease the learning rate.<\/p>\n<p>&nbsp;<\/p>\n<p><b>Feature Normalization<\/b><\/p>\n<p>There is a way to speed up gradient descent, called feature normalization. <b>Instead of feeding in the input features with whatever ranges they happen to have, we rescale them so that they all lie in roughly the same range.<\/b><\/p>\n<p>This works better because \u03b8 descends quickly on small ranges and slowly on large ranges, so gradient descent takes inefficient steps down to the global optimum when the features are very uneven. We prevent this by bringing the input features into roughly the same range.<\/p>\n<p>There are 2 techniques that help us with this task: <b>Feature scaling<\/b> and <b>Mean normalization<\/b>.<\/p>\n<p>With feature scaling, we divide the input values by the range (i.e. the maximum value minus the minimum value) of the input variable, which results in a new range of just 1.<\/p>\n<p>With mean normalization, we subtract the average value of an input variable from the values of that input variable, resulting in a new average value of just zero.<\/p>\n<p>&nbsp;<\/p>\n<p><b>Polynomial Regression<\/b><\/p>\n<p>Let&#8217;s explain polynomial regression with an example. Imagine you have another dataset of houses and their corresponding prices and you again want to predict their prices. 
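<p>The two normalization techniques just described can be combined in one small helper. A minimal sketch (the sample sizes are made up):<\/p>

```python
import numpy as np

def normalize(feature):
    # Mean normalization: subtract the mean so the new average is zero;
    # feature scaling: divide by the range (max - min) so the new range is 1.
    return (feature - feature.mean()) / (feature.max() - feature.min())

sizes = np.array([800.0, 1250.0, 2000.0, 3200.0])  # hypothetical sq-ft values
scaled = normalize(sizes)
print(scaled.mean())                # ~0.0 after mean normalization
print(scaled.max() - scaled.min())  # ~1.0 after feature scaling
```

<p>Applied to every input feature, this keeps the cost surface well-proportioned, so gradient descent takes efficient steps.<\/p>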
In the plotted dataset below, you can clearly see that a straight line wouldn&#8217;t fit the data very well.<\/p>\n<p style=\"text-align: center;\"><img src=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/9.png?w=477&amp;h=278\" alt=\"9.png\" width=\"477\" height=\"278\" \/><\/p>\n<p>The thing is, our hypothesis function does not have to be a straight line at all if that doesn&#8217;t fit the data well. 
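<p>For instance, a curved hypothesis can still be fitted with plain linear regression by treating x\u00b2 as an additional input feature. A minimal sketch (the data and the use of a direct least-squares solver are illustrative assumptions; gradient descent would find the same parameters):<\/p>

```python
import numpy as np

# Polynomial regression = linear regression on expanded features:
# h(x) = theta0 + theta1*x + theta2*x^2 treats x and x^2 as two inputs.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = x ** 2 + 1                                     # hypothetical curved data
X = np.column_stack([np.ones_like(x), x, x ** 2])  # design matrix [1, x, x^2]

# Solve the least-squares problem for [theta0, theta1, theta2] directly.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta)   # recovers the generating curve y = 1 + 0*x + 1*x^2
```

<p>Note that when powers of x are used as features, feature normalization becomes even more important, because x\u00b2 spans a much larger range than x.<\/p>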
We could also change the function so that it is a quadratic, cubic or square-root function (or any other form). You can see an example below.<\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center;\"><img src=\"https:\/\/machinelearningblogcom.files.wordpress.com\/2018\/01\/10.png?w=469&amp;h=305\" alt=\"10.png\" width=\"469\" height=\"305\" \/><\/p>\n<p>This is called polynomial regression. Unfortunately, explaining how this works in detail would go beyond the scope 
of this blog post, but we will definitely cover it in a future blog post.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Linear regression is one of the most popular and best understood algorithms in the machine learning landscape. Since regression tasks belong to the most common machine learning problems in supervised learning, every Machine Learning Engineer should have a&nbsp;thorough understanding of how it works. This blog post covers how the linear regression algorithm works, where it is used, how you can evaluate its performance and which tools &amp; techniques should be used along with it.<\/p>\n","protected":false},"author":413,"featured_media":2560,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[92],"ppma_author":[2327],"class_list":["post-1656","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-machine-learning"],"authors":[{"term_id":2327,"user_id":413,"is_guest":0,"slug":"niklas-donges","display_name":"Niklas Donges","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Donges","first_name":"Niklas","job_title":"","description":"<a href=\"https:\/\/www.linkedin.com\/in\/niklas-donges\/\">Niklas Donges<\/a>&nbsp;is Machine Learning Engineer at SAP. 
He is a Technical Blogger for the &#039;Towards Data Science&#039; publication"}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1656","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/413"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1656"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1656\/revisions"}],"predecessor-version":[{"id":29662,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1656\/revisions\/29662"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/2560"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1656"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1656"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1656"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1656"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}