{"id":1372,"date":"2019-02-15T10:32:06","date_gmt":"2019-02-15T10:32:06","guid":{"rendered":"http:\/\/kusuaks7\/?p=977"},"modified":"2023-07-27T15:58:26","modified_gmt":"2023-07-27T15:58:26","slug":"coding-deep-learning-for-beginners-linear-regression-part-1-initialization-and-prediction","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/coding-deep-learning-for-beginners-linear-regression-part-1-initialization-and-prediction\/","title":{"rendered":"Coding Deep Learning for Beginners\u200a\u2014\u200aLinear Regression (Part 1): Initialization and Prediction"},"content":{"rendered":"<p><strong><em>Ready to learn Machine Learning? Browse<\/em><\/strong> <strong><em><a href=\"https:\/\/www.experfy.com\/training\/tracks\/machine-learning-training-certification\">Machine Learning Training and Certification courses<\/a> developed by industry thought leaders and Experfy in Harvard Innovation Lab.<\/em><\/strong><\/p>\n<blockquote><p>This is the 3rd article of the series \u201c<strong>Coding Deep Learning for Beginners<\/strong>\u201d. Here, you will find\u00a0<em>links to the <\/em><a href=\"https:\/\/www.experfy.com\/blog\/coding-deep-learning-for-beginners-start\">1st article <\/a><em>and the\u00a0<a href=\"https:\/\/www.experfy.com\/blog\/coding-deep-learning-for-beginners-types-of-machine-learning\">2<sup>nd<\/sup> article<\/a>.<\/em><\/p><\/blockquote>\n<section>\n<h3 id=\"ad43\"><strong>Why Linear Regression?<\/strong><\/h3>\n<p id=\"5faa\">Some of you may wonder why an article series about explaining and coding Neural Networks starts with a<strong>\u00a0basic Machine Learning algorithm<\/strong>\u00a0such as Linear Regression. There are good reasons to start there. First of all, it is a very simple algorithm, so the reader can gain an\u00a0<strong>understanding of fundamental Machine Learning concepts<\/strong>\u00a0such as\u00a0<em>Supervised Learning<\/em>,\u00a0<em>Cost Function<\/em>, and\u00a0<em>Gradient Descent<\/em>. 
Additionally, after learning Linear Regression it is quite easy to understand the Logistic Regression algorithm and, believe it or not, that one can be categorised as a small Neural Network. You can expect all of that and more in the next few articles!<\/p>\n<h3 id=\"6249\"><strong>Tools<\/strong><\/h3>\n<p id=\"a6ce\">Let\u2019s introduce the\u00a0<strong>most popular libraries<\/strong>\u00a0that can be found in almost every Python-based Machine Learning or Data Science project.<\/p>\n<ul>\n<li id=\"d729\"><a href=\"http:\/\/www.numpy.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/www.numpy.org\/\" data->NumPy<\/a> \u2014 a library for scientific computing, perfect for Multivariable Calculus &amp; Linear Algebra. It provides the\u00a0<a href=\"https:\/\/docs.scipy.org\/doc\/numpy-1.14.0\/reference\/generated\/numpy.ndarray.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.scipy.org\/doc\/numpy-1.14.0\/reference\/generated\/numpy.ndarray.html\" data->ndarray<\/a>\u00a0class, which can be compared to a<strong>\u00a0Python list that can be treated as a vector or matrix<\/strong>.<\/li>\n<li id=\"8e6b\"><a href=\"https:\/\/matplotlib.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/matplotlib.org\/\" data->Matplotlib<\/a> \u2014 a toolkit for\u00a0<strong>data visualisation<\/strong> that allows creating various 2D and 3D graphs.<\/li>\n<li id=\"7900\"><a href=\"https:\/\/pandas.pydata.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/pandas.pydata.org\/\" data->Pandas<\/a> \u2014 a library that builds on NumPy and wraps Matplotlib for plotting. It provides the\u00a0<a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/generated\/pandas.DataFrame.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/generated\/pandas.DataFrame.html\" data->DataFrame<\/a>\u00a0class. 
It\u00a0<strong>treats NumPy matrices as tables<\/strong>, allowing access to rows and columns by their attached names. It is very helpful in the\u00a0<strong>data loading, saving, wrangling, and exploration process<\/strong>, and it provides a set of functions that makes development faster.<\/li>\n<\/ul>\n<p id=\"229d\">Each library can be installed separately using\u00a0<a href=\"https:\/\/pypi.org\/project\/pip\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/pypi.org\/project\/pip\/\" data->pip<\/a>, the Python package installer. They will be imported in the code of every article under their conventional aliases: <code>np<\/code> for NumPy, <code>pd<\/code> for Pandas and <code>plt<\/code> for <code>matplotlib.pyplot<\/code>.<\/p>\n<p>&nbsp;<\/p>\n<h3 id=\"edd3\"><strong>What is Linear Regression?<\/strong><\/h3>\n<p id=\"cdad\">It\u2019s a\u00a0<strong>Supervised Learning algorithm<\/strong>\u00a0whose goal is to\u00a0<strong>predict continuous, numerical values based on given data input<\/strong>. From the geometrical perspective, each data sample is a point. Linear Regression tries to\u00a0<strong>find parameters of the linear function<\/strong>, so that the\u00a0<strong>distance between all the points and the line is as small as possible<\/strong>. The algorithm used to update the parameters is called\u00a0<strong>Gradient Descent<\/strong>.<\/p>\n<figure id=\"5f97\"><canvas width=\"75\" height=\"37\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 368px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*IjxpxWcKX8EJUVFBNFeKdA.gif\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*IjxpxWcKX8EJUVFBNFeKdA.gif\" \/><\/figure>\n<p id=\"8549\" style=\"text-align: center;\">Training of a Linear Regression model. The left graph displays the change of the linear function parameters over time. 
The plot on the right renders the linear function using the current parameters (source:\u00a0<a href=\"https:\/\/github.com\/llSourcell\/linear_regression_live\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/llSourcell\/linear_regression_live\" data->Siraj Raval\u00a0GitHub<\/a>).<\/p>\n<p>For example, if we have a dataset consisting of apartment properties and their prices in some specific area, the Linear Regression algorithm can be used to find a mathematical function that estimates the value of a different apartment (outside of the dataset) based on its attributes.<\/p>\n<figure id=\"3588\"><canvas width=\"75\" height=\"15\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 148px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*eOrewomaMFP0E1fNyt8waw.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*eOrewomaMFP0E1fNyt8waw.png\" \/><\/figure>\n<p id=\"e9b4\">Another example is predicting the size of the food supply for a grocery store based on sales data. That way the business can decrease unnecessary food waste.\u00a0<strong>Such a mapping is achievable for any linearly correlated input-output data pairs.<\/strong><\/p>\n<figure id=\"cba6\"><canvas width=\"75\" height=\"15\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 151px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*1lagOXTq9rdWyJz-zUr-yQ.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*1lagOXTq9rdWyJz-zUr-yQ.png\" \/><\/figure>\n<h3 id=\"f1a4\"><strong>Data preparation<\/strong><\/h3>\n<p id=\"d759\">Before coding the Linear Regression part, it would be good to have some problem to solve. It is possible to find a lot of datasets on websites like the\u00a0UCI Repository or\u00a0<a href=\"https:\/\/www.kaggle.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.kaggle.com\/\" data->Kaggle<\/a>. 
After going through many of those, none was suitable for the case study of this article.<\/p>\n<p id=\"1c9d\">In order to get data, I visited the Polish website\u00a0<a href=\"https:\/\/www.dominium.pl\/pl\/szukaj\/mieszkania\/nowe\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.dominium.pl\/pl\/szukaj\/mieszkania\/nowe\" data->dominium.pl<\/a>, which is a search engine for apartments in the city of Cracow, the area where I live. I chose 76 apartments entirely at random, wrote down their attributes and saved them to a\u00a0<strong>.csv file<\/strong>. The goal was to\u00a0<strong>train a Linear Regression model capable of predicting apartment prices<\/strong>\u00a0in Cracow.<\/p>\n<p id=\"9b3a\">The dataset is available on my Dropbox under this\u00a0<a href=\"https:\/\/www.dropbox.com\/s\/1octs0jg5o5j82o\/cracow_apartments.csv?dl=0\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.dropbox.com\/s\/1octs0jg5o5j82o\/cracow_apartments.csv?dl=0\" data->link<\/a>.<\/p>\n<h4 id=\"d873\"><strong>Loading data<\/strong><\/h4>\n<p id=\"b520\">Let\u2019s start by reading data from the\u00a0.csv file into a Pandas DataFrame object and displaying a few data rows. To achieve that, the\u00a0<a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/generated\/pandas.read_csv.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/generated\/pandas.read_csv.html\" data->read_csv<\/a>\u00a0function will be used. The data is separated with the comma character, which is why the\u00a0<code>sep=\",\"<\/code>\u00a0parameter was added. 
Function\u00a0<a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/generated\/pandas.DataFrame.head.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/generated\/pandas.DataFrame.head.html\" data->head<\/a>\u00a0returns the first five rows of data, which Jupyter Notebook renders as a readable HTML table.<\/p>\n<p>&nbsp;<\/p>\n<p id=\"3cf2\">The output of the code looks as follows:<\/p>\n<figure id=\"5998\"><canvas width=\"75\" height=\"20\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 208px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*y2uKxTDyKCSJd-K8eUGvrw.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*y2uKxTDyKCSJd-K8eUGvrw.png\" \/><\/figure>\n<p id=\"c7ec\" style=\"text-align: center;\">DataFrame visualisation in\u00a0<a href=\"http:\/\/jupyter.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/jupyter.org\/\" data->Jupyter Notebook<\/a>.<\/p>\n<p>As presented in the table, there are\u00a0<strong>four columns<\/strong>: three features describing apartment properties and the target value:<\/p>\n<ul>\n<li id=\"0676\"><strong>distance_to_city_center<\/strong>\u00a0&#8211; distance from the dwelling to\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Main_Square,_Krak%C3%B3w\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Main_Square,_Krak%C3%B3w\" data->Cracow Main Square<\/a> on foot, measured with Google Maps,<\/li>\n<li id=\"995b\"><strong>rooms<\/strong>\u00a0&#8211; the number of rooms in the apartment,<\/li>\n<li id=\"d6fd\"><strong>size<\/strong>\u00a0&#8211; the area of the apartment measured in square meters,<\/li>\n<li id=\"e67d\"><strong>price<\/strong>\u00a0&#8211; target value (the one that needs to be predicted by the model), the cost of the apartment measured in the Polish national currency, the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Polish_z%C5%82oty\" target=\"_blank\" rel=\"noopener noreferrer\" 
data-href=\"https:\/\/en.wikipedia.org\/wiki\/Polish_z%C5%82oty\" data->z\u0142oty<\/a>.<\/li>\n<\/ul>\n<h4 id=\"1b48\"><strong>Visualising data<\/strong><\/h4>\n<p id=\"f16a\">It is very important to always understand the structure of the data. The more features there are, the harder it is. In this case, a\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Scatter_plot\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Scatter_plot\" data->scatter plot<\/a>\u00a0is used to\u00a0<strong>display the relationship between the target and the training features<\/strong>.<\/p>\n<figure id=\"d260\" data-scroll=\"native\"><canvas width=\"75\" height=\"15\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 158px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1000\/1*NmeKDxBgGdt7D2SvaXroAg.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/1000\/1*NmeKDxBgGdt7D2SvaXroAg.png\" \/><\/figure>\n<p id=\"891f\" style=\"text-align: center;\">The charts show all the data from cracow_apartments.csv. They were prepared with the Matplotlib library in Jupyter Notebook. The code used to create these charts can be found under this\u00a0<a href=\"https:\/\/gist.github.com\/FisherKK\/0113b1eda361856a1cd29ad4fbd180d2\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/gist.github.com\/FisherKK\/0113b1eda361856a1cd29ad4fbd180d2\" data->link<\/a>.<\/p>\n<p>Depending on what needs to be shown, other types of visualisation (e.g.\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Box_plot\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Box_plot\" data->box plot<\/a>) and techniques (e.g.\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Cluster_analysis\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Cluster_analysis\" data->clustering<\/a>) could be useful. 
Here, a\u00a0<strong>linear dependency between each feature and the target can be observed<\/strong>: as values on the x-axis increase, values on the y-axis increase or decrease linearly. That is convenient, because if it were not the case (e.g. if the relationship were exponential), it would be hard to fit a line to the points and a different algorithm should be considered.<\/p>\n<h3 id=\"c10a\"><strong>Formula<\/strong><\/h3>\n<p id=\"97b3\">The Linear Regression\u00a0<strong>model is a mathematical formula<\/strong>\u00a0that takes a\u00a0<strong>vector of numerical values\u00a0<\/strong>(attributes of a single data sample) as an input and uses them to\u00a0<strong>make a prediction<\/strong>.<\/p>\n<p id=\"68e3\">Applying the same statement to the presented problem, there are 76 samples containing attributes of Cracow apartments, where each sample is a vector from the mathematical perspective. Each\u00a0<strong>vector of features is paired with a target value\u00a0<\/strong>(the expected result of the formula).<\/p>\n<figure id=\"95f0\"><img decoding=\"async\" style=\"width: 700px; height: 39px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*rRubXYPCZO-ZuEwGEaF4uw.png\" data-action=\"zoom\" data-action-value=\"1*rRubXYPCZO-ZuEwGEaF4uw.png\" data-height=\"92\" data-image-id=\"1*rRubXYPCZO-ZuEwGEaF4uw.png\" data-width=\"1632\" \/><\/figure>\n<p id=\"8b5d\">According to the algorithm,\u00a0<strong>every feature has a weight parameter assigned.\u00a0<\/strong>It represents the feature\u2019s<strong>\u00a0importance\u00a0<\/strong>to the model. 
The goal is to find the values of the weights so that the following equation holds as closely as possible for every apartment in the data.<\/p>\n<figure id=\"45f7\"><canvas width=\"75\" height=\"5\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 62px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*xTNtQFAzNYsW_Amw1Mifxw.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*xTNtQFAzNYsW_Amw1Mifxw.png\" \/><\/figure>\n<p id=\"56ab\">The left side of the equation is a\u00a0<strong>linear function<\/strong>.\u00a0<strong>Manipulating the weight values changes the slope of the line<\/strong>. However, one element is still missing. The current function always passes through the (0,0) point of the coordinate system. To fix that, another trainable parameter is added.<\/p>\n<figure id=\"170b\"><canvas width=\"75\" height=\"5\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 62px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*gwJByt1-1THrDvWjNI_-Vg.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*gwJByt1-1THrDvWjNI_-Vg.png\" \/><\/figure>\n<p id=\"7080\">The parameter is named\u00a0<strong>bias and it gives the formula the freedom to move up and down along the y-axis<\/strong>.<\/p>\n<p id=\"fc75\">The purple parameters belong to the model and are used for prediction for every incoming sample. That\u2019s why it is necessary to find a solution that works best for all samples. Formally, the formula can be written as:<\/p>\n<figure id=\"1b69\"><canvas width=\"75\" height=\"20\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 192px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*mtQi7bOrBYkTLAt541CHkA.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*mtQi7bOrBYkTLAt541CHkA.png\" \/><\/figure>\n<h3 id=\"8511\"><strong>Initialization<\/strong><\/h3>\n<p id=\"80a3\">This is the phase where the\u00a0<strong>first version of a model is created<\/strong>. 
After initialization the model can already be used for prediction, but without the training process the results will be far from good. There are two things to be done:<\/p>\n<ul>\n<li id=\"0eac\"><strong>create variables in code<\/strong>\u00a0that represent the weights and bias parameters,<\/li>\n<li id=\"a848\"><strong>decide on starting values<\/strong>\u00a0of the model parameters.<\/li>\n<\/ul>\n<p id=\"d972\">Initial values of model parameters are crucial for Neural Networks. In the case of Linear Regression,\u00a0<strong>parameter values can be set to zero<\/strong>\u00a0at the start.<\/p>\n<p>&nbsp;<\/p>\n<p id=\"0bfc\">Function\u00a0<code>init(n)<\/code>\u00a0returns a dictionary containing the model parameters. According to the terminology presented in the legend below the mathematical formula,\u00a0<em>n is the number of features<\/em>\u00a0used to describe a data sample. It is passed to the\u00a0<a href=\"https:\/\/docs.scipy.org\/doc\/numpy-1.14.0\/reference\/generated\/numpy.zeros.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.scipy.org\/doc\/numpy-1.14.0\/reference\/generated\/numpy.zeros.html\" data->zeros<\/a>\u00a0function of the NumPy library, which returns a vector of ndarray type with n elements, each set to zero. Bias is a scalar set to 0.0, and<strong>\u00a0it is a good practice to keep the variables as floats rather than integers<\/strong>. The weights and the bias are accessible under the \u201cw\u201d and \u201cb\u201d dictionary keys respectively.<\/p>\n<p id=\"078b\">For the Cracow apartment dataset there are three features describing each sample. 
Here is the result of calling\u00a0<code>init(3)<\/code>:<\/p>\n<figure id=\"dc36\"><img decoding=\"async\" style=\"width: 700px; height: 19px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*oXFspX1GcZSOIorvnezazw.png\" data-action=\"zoom\" data-action-value=\"1*oXFspX1GcZSOIorvnezazw.png\" data-height=\"44\" data-image-id=\"1*oXFspX1GcZSOIorvnezazw.png\" data-width=\"1534\" \/><\/figure>\n<h3 id=\"827d\"><strong>Prediction<\/strong><\/h3>\n<p id=\"9fbb\">The created model parameters can be used by the model to make a prediction. The formula has already been shown. Now it\u2019s time to turn it into Python code. First, every feature is multiplied by its corresponding weight and the products are summed up. Then the bias parameter is added to that sum. The outcome is a prediction.<\/p>\n<p>&nbsp;<\/p>\n<p id=\"8a21\">Function\u00a0<code><span style=\"background-color: #f0f8ff;\">predict(x, parameters)<\/span><\/code>\u00a0takes two arguments:<\/p>\n<ul>\n<li id=\"5a77\">vector\u00a0<code><span style=\"background-color: #f0f8ff;\">x<\/span><\/code><span style=\"background-color: #f0f8ff;\">\u00a0<\/span>of features representing a data sample (e.g. 
a single apartment),<\/li>\n<li id=\"44ac\">Python dictionary\u00a0<code><span style=\"background-color: #f0f8ff;\">parameters<\/span><\/code>\u00a0which stores the parameters of the model along with their current state.<\/li>\n<\/ul>\n<h3 id=\"6634\"><strong>Assemble<\/strong><\/h3>\n<p id=\"ff71\">Let\u2019s put together all the code parts that were created and look at the results.<\/p>\n<p>&nbsp;<\/p>\n<p id=\"1280\"><strong>Only one feature was used for prediction,\u00a0<\/strong>which reduced the formula to the form:<\/p>\n<figure id=\"ebea\"><img decoding=\"async\" style=\"width: 700px; height: 32px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*y41rsL-etM_NFXKC42zU7A.png\" data-action=\"zoom\" data-action-value=\"1*y41rsL-etM_NFXKC42zU7A.png\" data-height=\"68\" data-image-id=\"1*y41rsL-etM_NFXKC42zU7A.png\" data-width=\"1450\" \/><\/figure>\n<p id=\"5d83\">This was intentional, as\u00a0<strong>displaying results on data that has more than two or three dimensions becomes troublesome<\/strong>, unless\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Dimensionality_reduction\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Dimensionality_reduction\" data->Dimensionality Reduction<\/a> techniques are used (e.g.\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Principal_component_analysis\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Principal_component_analysis\" data->PCA<\/a>). From now on, for learning purposes, all code development will be done only on the\u00a0<strong>size<\/strong>\u00a0feature. 
When the Linear Regression code is finished, results using all features will be presented.<\/p>\n<figure id=\"7652\"><canvas width=\"75\" height=\"50\"><\/canvas><img decoding=\"async\" style=\"width: 404px; height: 274px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*8ImNJi8Bbzgb3GdjtgR56A.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*8ImNJi8Bbzgb3GdjtgR56A.png\" \/><\/figure>\n<p id=\"d34f\" style=\"text-align: center;\">Line used to fit the data by the Linear Regression model with the current parameters. Code for the visualisation is available under this\u00a0<a href=\"https:\/\/gist.github.com\/FisherKK\/a78a54d4fa9bdd56c9512f24b98df5f9\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/gist.github.com\/FisherKK\/a78a54d4fa9bdd56c9512f24b98df5f9\" data->link<\/a>.<\/p>\n<p>The model parameters were initialized with zero values, which means that the output of the formula will always be equal to zero. Consequently, the\u00a0<code>prediction<\/code>\u00a0is a Python list of 76 zero values, which are the predicted prices for each apartment. But that\u2019s ok for now.\u00a0<strong>Model behavior will improve once training with Gradient Descent is explained and applied.<\/strong><\/p>\n<p id=\"c2c0\">Bonus takeaways from the code snippet are:<\/p>\n<ul>\n<li id=\"f562\">Features to be used by the model and the target value were stored in the\u00a0<code>features<\/code> and\u00a0<code>target<\/code>\u00a0Python lists. Thanks to that there is no need to modify the whole code if a different set of features should be used.<\/li>\n<li id=\"43bb\">It is possible to convert a DataFrame object to an ndarray by using the\u00a0<code>as_matrix<\/code> function (deprecated and later removed from Pandas; <code>to_numpy()<\/code> is the current equivalent).<\/li>\n<\/ul>\n<\/section>\n<section>\n<hr \/>\n<h3 id=\"f321\"><strong>Summary<\/strong><\/h3>\n<p id=\"8402\">In this article, I have introduced the tools that I am going to use throughout the article series. Then I have presented the problem I am going to solve with the Linear Regression algorithm. 
At the end, I have shown how to create a Linear Regression model and use it for making a prediction.<\/p>\n<p id=\"f0b2\">In the next article I will explain how to compare sets of parameters and measure model performance. Finally, I will show how to update model parameters with the Gradient Descent algorithm.<\/p>\n<\/section>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Linear Regression is a very simple algorithm, so the reader can gain an&nbsp;understanding of fundamental Machine Learning concepts&nbsp;such as&nbsp;Supervised Learning,&nbsp;Cost Function, and&nbsp;Gradient Descent. Additionally, after learning Linear Regression it is quite easy to understand the Logistic Regression algorithm and, believe it or not, that one can be categorised as a small Neural Network. &nbsp;Linear Regression is a&nbsp;Supervised Learning algorithm&nbsp;whose goal is to&nbsp;predict continuous, numerical values based on given data input.&nbsp;<\/p>\n","protected":false},"author":321,"featured_media":3158,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[97],"ppma_author":[2905],"class_list":["post-1372","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-artificial-intelligence"],"authors":[{"term_id":2905,"user_id":321,"is_guest":0,"slug":"kamil-krzyk","display_name":"Kamil Krzyk","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Krzyk","first_name":"Kamil","job_title":"","description":"Kamil Krzyk is a Data Scientist at <a href=\"http:\/\/www.azimo.com\/\">Azimo<\/a>.&nbsp;In the past, he was a Full Stack Engineer on the mobile team. 
Passionate about Machine Learning technology, he focuses on building software components which use data and math as its core."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1372","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/321"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1372"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1372\/revisions"}],"predecessor-version":[{"id":29641,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1372\/revisions\/29641"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/3158"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1372"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1372"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1372"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1372"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}