{"id":2162,"date":"2019-12-30T02:52:24","date_gmt":"2019-12-30T02:52:24","guid":{"rendered":"http:\/\/kusuaks7\/?p=1767"},"modified":"2024-02-01T14:57:51","modified_gmt":"2024-02-01T14:57:51","slug":"beginners-guide-to-the-three-types-of-machine-learning","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/beginners-guide-to-the-three-types-of-machine-learning\/","title":{"rendered":"Beginners Guide to the Three Types of Machine Learning"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"2162\" class=\"elementor elementor-2162\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-5891a083 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"5891a083\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-60d50cb0\" data-id=\"60d50cb0\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-3b8b77e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"3b8b77e\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8d80fa4\" data-id=\"8d80fa4\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap 
elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-c6c4f1b elementor-widget elementor-widget-text-editor\" data-id=\"c6c4f1b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tMachine learning problems can generally be divided into three types:\u00a0classification and regression, which are known as supervised learning, and unsupervised learning, which in the context of machine learning applications often refers to clustering.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4812de6 elementor-widget elementor-widget-text-editor\" data-id=\"4812de6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tIn the following article, I am going to give a brief introduction to each of these three problems and will include a walkthrough in the popular Python library\u00a0<a href=\"https:\/\/scikit-learn.org\/stable\/index.html\" target=\"_blank\" rel=\"noopener noreferrer\">scikit-learn<\/a>.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7dd473e elementor-widget elementor-widget-text-editor\" data-id=\"7dd473e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tBefore I start, I\u2019ll give a brief explanation of the meaning behind the terms supervised and unsupervised learning.\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f2dd9b8 elementor-widget elementor-widget-text-editor\" data-id=\"f2dd9b8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div 
class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<strong>Supervised Learning:<\/strong>\u00a0<em>In supervised learning, you have a known set of inputs (features) and a known set of outputs (labels). Traditionally these are known as X and y. The goal of the algorithm is to learn the mapping function that maps the input to the output, so that when given new examples of X the machine can correctly predict the corresponding y labels.<\/em>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0a199f0 elementor-widget elementor-widget-text-editor\" data-id=\"0a199f0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<strong>Unsupervised Learning:<\/strong><em>\u00a0In unsupervised learning, you only have a set of inputs (X) and no corresponding labels (y). The goal of the algorithm is to find previously unknown patterns in the data. Quite often these algorithms are used to find meaningful clusters of similar samples of X, in effect finding the categories intrinsic to the data.<\/em>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-fd0b6e0 elementor-widget elementor-widget-heading\" data-id=\"fd0b6e0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Classification<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-08a7e90 elementor-widget elementor-widget-text-editor\" data-id=\"08a7e90\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tIn classification, the outputs (y) are categories. 
These can be binary, for example, if we were classifying spam email vs not spam email. They can also be multiple categories, such as classifying species of\u00a0<a href=\"https:\/\/archive.ics.uci.edu\/ml\/datasets\/iris\" target=\"_blank\" rel=\"noopener noreferrer\">flowers<\/a>; this is known as multiclass classification.\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7db7d92 elementor-widget elementor-widget-text-editor\" data-id=\"7db7d92\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tLet\u2019s walk through a simple example of classification using scikit-learn. If you don\u2019t already have this installed, it can be installed either via pip or conda as outlined\u00a0<a href=\"https:\/\/scikit-learn.org\/stable\/install.html\" target=\"_blank\" rel=\"noopener noreferrer\">here<\/a>.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-fad6211 elementor-widget elementor-widget-text-editor\" data-id=\"fad6211\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tScikit-learn has a number of datasets that can be directly accessed via the library. For ease in this article, I will be using these example datasets throughout. To illustrate classification I will use the wine dataset, which is a multiclass classification problem. In the dataset, the inputs (X) consist of 13 features relating to various properties of each wine type. 
The known outputs (y) are wine types, which in the dataset have been given the numbers 0, 1 or 2.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e54594c elementor-widget elementor-widget-text-editor\" data-id=\"e54594c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tThe imports I am using for all the code in this article are shown below.\n<div style=\"background: #eee; border: 1px solid #ccc; padding: 5px 10px;\"><span style=\"font-family: courier new,courier,monospace;\">import pandas as pd\nimport numpy as np\nfrom sklearn.datasets import load_wine\nfrom sklearn.datasets import load_boston\nfrom sklearn.model_selection import train_test_split\nfrom sklearn import preprocessing\nfrom sklearn.metrics import f1_score\nfrom sklearn.metrics import mean_squared_error\nfrom math import sqrt\nfrom sklearn.neighbors import KNeighborsClassifier\nfrom sklearn.svm import SVC, LinearSVC, NuSVC\nfrom sklearn.tree import DecisionTreeClassifier\nfrom sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier\nfrom sklearn.discriminant_analysis import LinearDiscriminantAnalysis\nfrom sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis\nfrom sklearn import linear_model\nfrom sklearn.linear_model import ElasticNetCV\nfrom sklearn.svm import SVR\nfrom sklearn.cluster import KMeans\nfrom yellowbrick.cluster import KElbowVisualizer\nfrom yellowbrick.cluster import SilhouetteVisualizer<\/span><\/div>\n&nbsp;\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-464586b elementor-widget elementor-widget-text-editor\" data-id=\"464586b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tIn the below code I am downloading the 
data and converting it to a pandas data frame.\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0695594 elementor-widget elementor-widget-text-editor\" data-id=\"0695594\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div style=\"background: #eee; border: 1px solid #ccc; padding: 5px 10px;\"><span style=\"font-family: courier new,courier,monospace;\">wine = load_wine()\nwine_df = pd.DataFrame(wine.data, columns=wine.feature_names)\nwine_df['TARGET'] = pd.Series(wine.target)<\/span><\/div>\n&nbsp;\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2d9b16d elementor-widget elementor-widget-text-editor\" data-id=\"2d9b16d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\nThe next stage in a supervised learning problem is to split the data into test and train sets. The train set can be used by the algorithm to learn the mapping between inputs and outputs, and then the reserved test set can be used to evaluate how well the model has learned this mapping. 
In the below code I am using the scikit-learn model_selection function\u00a0<code>train_test_split<\/code>\u00a0to do this.\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-05c34c5 elementor-widget elementor-widget-text-editor\" data-id=\"05c34c5\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t&nbsp;\n<div style=\"background: #eee; border: 1px solid #ccc; padding: 5px 10px;\"><span style=\"font-family: courier new,courier,monospace;\">X_w = wine_df.drop(['TARGET'], axis=1)\ny_w = wine_df['TARGET']\nX_train_w, X_test_w, y_train_w, y_test_w = train_test_split(X_w, y_w, test_size=0.2)<\/span><\/div>\n&nbsp;\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-763072b elementor-widget elementor-widget-text-editor\" data-id=\"763072b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tIn the next step, we need to choose the algorithm best suited to learning the mapping in our chosen dataset. In scikit-learn there are many different algorithms to choose from, all of which use different functions and methods to learn the mapping; you can view the full list\u00a0<a href=\"https:\/\/scikit-learn.org\/stable\/supervised_learning.html#supervised-learning\" target=\"_blank\" rel=\"noopener noreferrer\">here<\/a>.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5956a25 elementor-widget elementor-widget-text-editor\" data-id=\"5956a25\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tTo determine the best model I am running the following code. 
I am training the model using a selection of algorithms and obtaining the F1 score for each one. The F1 score is a good indicator of the overall accuracy of a classifier. I have written a detailed description of the various metrics that can be used to evaluate a classifier\u00a0<a href=\"https:\/\/towardsdatascience.com\/understanding-the-confusion-matrix-and-its-business-applications-c4e8aaf37f42\" target=\"_blank\" rel=\"noopener noreferrer\" class=\"broken_link\">here<\/a>.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9a81196 elementor-widget elementor-widget-text-editor\" data-id=\"9a81196\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t&nbsp;\n<div style=\"background: #eee; border: 1px solid #ccc; padding: 5px 10px;\"><span style=\"font-family: courier new,courier,monospace;\">classifiers = [\n    KNeighborsClassifier(3),\n    SVC(kernel=\"rbf\", C=0.025, probability=True),\n    NuSVC(probability=True),\n    DecisionTreeClassifier(),\n    RandomForestClassifier(),\n    AdaBoostClassifier(),\n    GradientBoostingClassifier()\n]\nfor classifier in classifiers:\n    model = classifier\n    model.fit(X_train_w, y_train_w)\n    y_pred_w = model.predict(X_test_w)\n    print(classifier)\n    print(\"model score: %.3f\" % f1_score(y_test_w, y_pred_w, average='weighted'))<\/span><\/div>\n&nbsp;\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-94c500f elementor-widget elementor-widget-image\" data-id=\"94c500f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/miro.medium.com\/max\/1236\/1*zOjll7N5gAWNNj8Rsj-aIQ.png\" alt=\"\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div 
class=\"elementor-element elementor-element-1d8de45 elementor-widget elementor-widget-text-editor\" data-id=\"1d8de45\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tA perfect F1 score is 1.0; the closer the score is to 1.0, the better the model performance. The results above suggest that the Random Forest Classifier is the best model for this dataset.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-36c204e elementor-widget elementor-widget-heading\" data-id=\"36c204e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Regression<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ecb05d8 elementor-widget elementor-widget-text-editor\" data-id=\"ecb05d8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tIn regression, the outputs (y) are continuous values rather than categories. 
An example of regression would be predicting how many sales a store may make next month, or what the future price of your house might be.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-24e237f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"24e237f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-e576735\" data-id=\"e576735\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-4364b6b elementor-widget elementor-widget-text-editor\" data-id=\"4364b6b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tAgain, to illustrate regression I will use a dataset from scikit-learn known as the Boston housing dataset. This consists of 13 features (X), which are various properties of a house, such as the number of rooms, the age and the crime rate for the location. 
The output (y) is the price of the house.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-235ea82 elementor-widget elementor-widget-text-editor\" data-id=\"235ea82\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tI am loading the data using the code below and splitting it into test and train sets using the same method I used for the wine dataset.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-b1dcd7a elementor-widget elementor-widget-text-editor\" data-id=\"b1dcd7a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t&nbsp;\n<div style=\"background: #eee; border: 1px solid #ccc; padding: 5px 10px;\"><span style=\"font-family: courier new,courier,monospace;\">boston = load_boston()\nboston_df = pd.DataFrame(boston.data, columns=boston.feature_names)\nboston_df['TARGET'] = pd.Series(boston.target)\nX_b = boston_df.drop(['TARGET'], axis=1)\ny_b = boston_df['TARGET']\nX_train_b, X_test_b, y_train_b, y_test_b = train_test_split(X_b, y_b, test_size=0.2)<\/span><\/div>\n&nbsp;\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5a33ab9 elementor-widget elementor-widget-text-editor\" data-id=\"5a33ab9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tWe can use this\u00a0<a href=\"https:\/\/scikit-learn.org\/stable\/tutorial\/machine_learning_map\/index.html\" target=\"_blank\" rel=\"noopener noreferrer\">cheat sheet<\/a>\u00a0to see the available algorithms suited to regression problems in scikit-learn. 
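To score the regressors I will use the root mean squared error (RMSE), which can be built from scikit-learn's mean_squared_error. A minimal sketch first, using made-up actual and predicted values (illustration only, not the housing data):

```python
# Minimal sketch: RMSE from scikit-learn's mean_squared_error.
# The actual/predicted values below are made up for illustration only.
from math import sqrt
from sklearn.metrics import mean_squared_error

y_true = [3.0, 2.5, 4.0, 5.0]
y_pred = [2.5, 3.0, 4.0, 4.0]

# mean squared error = (0.5**2 + 0.5**2 + 0**2 + 1**2) / 4 = 0.375
mse = mean_squared_error(y_true, y_pred)
rmse = sqrt(mse)  # sqrt(0.375) ≈ 0.612
print("RMSE: %.3f" % rmse)
```

The same `sqrt(mean_squared_error(...))` pattern is used to score each regressor below.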
We will use similar code to the classification problem to loop through a selection and print out the scores for each.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7076bfe elementor-widget elementor-widget-text-editor\" data-id=\"7076bfe\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\nThere are a number of different metrics used to evaluate regression models. These are all essentially error metrics and measure the difference between the actual and predicted values achieved by the model. I have used the root mean squared error (RMSE). For this metric, the closer the value is to zero, the better the performance of the model. This\u00a0<a href=\"https:\/\/www.dataquest.io\/blog\/understanding-regression-error-metrics\/\" target=\"_blank\" rel=\"noopener noreferrer\">article<\/a>\u00a0gives a really good explanation of error metrics for regression problems.\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-aaaec0a elementor-widget elementor-widget-text-editor\" data-id=\"aaaec0a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div style=\"background: #eee; border: 1px solid #ccc; padding: 5px 10px;\"><span style=\"font-family: courier new,courier,monospace;\">regressors = [\n    linear_model.Lasso(alpha=0.1),\n    linear_model.LinearRegression(),\n    ElasticNetCV(alphas=None, copy_X=True, cv=5, eps=0.001, fit_intercept=True,\n                 l1_ratio=0.5, max_iter=1000, n_alphas=100, n_jobs=None,\n                 normalize=False, positive=False, precompute='auto', random_state=0,\n                 selection='cyclic', tol=0.0001, verbose=0),\n    SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1,\n        gamma='auto_deprecated', kernel='rbf', 
max_iter=-1, shrinking=True,\n        tol=0.001, verbose=False),\n    linear_model.Ridge(alpha=.5)\n]\nfor regressor in regressors:\n    model = regressor\n    model.fit(X_train_b, y_train_b)\n    y_pred_b = model.predict(X_test_b)\n    print(regressor)\n    print(\"root mean squared error: %.3f\" % sqrt(mean_squared_error(y_test_b, y_pred_b)))<\/span><\/div>\n<p style=\"text-align: center;\"><\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" style=\"width: 700px; height: 320px;\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/1304\/1*UMmmKiZmII8yRmUEUYpJ9g.png\" \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1a4c2c7 elementor-widget elementor-widget-text-editor\" data-id=\"1a4c2c7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tThe RMSE scores suggest that the linear regression and ridge regression algorithms perform best for this dataset.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2a95273 elementor-widget elementor-widget-heading\" data-id=\"2a95273\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Unsupervised learning<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-affcbde elementor-widget elementor-widget-text-editor\" data-id=\"affcbde\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tThere are a number of different types of unsupervised learning, but for simplicity here I am going to focus on the\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Cluster_analysis\" target=\"_blank\" 
rel=\"noopener noreferrer\">clustering methods<\/a>. There are many different algorithms for clustering, all of which use slightly different techniques to find clusters of inputs.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-fa56bb1 elementor-widget elementor-widget-text-editor\" data-id=\"fa56bb1\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tProbably one of the most widely used methods is K-means. This algorithm performs an iterative process in which a specified number of randomly generated means (centroids) is initialised. A distance metric, typically\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Euclidean_distance\" target=\"_blank\" rel=\"noopener noreferrer\">Euclidean<\/a>\u00a0distance, is calculated from each data point to the centroids, and each point is assigned to its nearest centroid, creating clusters of similar values. The mean of each cluster then becomes the new centroid, and this process is repeated until the optimum result has been achieved.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2f625af elementor-widget elementor-widget-text-editor\" data-id=\"2f625af\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tLet\u2019s use the wine dataset we used in the classification task, with the y labels removed, and see how well the K-means algorithm can identify the wine types from the inputs.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-636ca96 elementor-widget elementor-widget-text-editor\" data-id=\"636ca96\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tAs we are only using the inputs for this model, I am splitting the 
data into test and train using a slightly different method.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-b09cf5f elementor-widget elementor-widget-text-editor\" data-id=\"b09cf5f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t&nbsp;\n<div style=\"background: #eee; border: 1px solid #ccc; padding: 5px 10px;\"><span style=\"font-family: courier new,courier,monospace;\">np.random.seed(0)\nmsk = np.random.rand(len(X_w)) &lt; 0.8\ntrain_w = X_w[msk]\ntest_w = X_w[~msk]<\/span><\/div>\n&nbsp;\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2d02b9f elementor-widget elementor-widget-text-editor\" data-id=\"2d02b9f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tAs K-means relies on the distance metric to determine the clusters, it is usually necessary to perform feature scaling (ensuring that all features have the same scale) before training the model. 
In the below code I am using the MinMaxScaler to scale the features so that all values fall between 0 and 1.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-13ee091 elementor-widget elementor-widget-text-editor\" data-id=\"13ee091\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t&nbsp;\n<div style=\"background: #eee; border: 1px solid #ccc; padding: 5px 10px;\"><span style=\"font-family: courier new,courier,monospace;\">x = train_w.values\nmin_max_scaler = preprocessing.MinMaxScaler()\nx_scaled = min_max_scaler.fit_transform(x)\nX_scaled = pd.DataFrame(x_scaled, columns=train_w.columns)<\/span><\/div>\n&nbsp;\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a3c88d3 elementor-widget elementor-widget-text-editor\" data-id=\"a3c88d3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tWith K-means, you have to specify the number of clusters the algorithm should use, so one of the first steps is to identify the optimum number of clusters. This is achieved by iterating through a number of values of k and plotting the results on a chart. This is known as the Elbow method, as it typically produces a plot with a curve that looks a little like the curve of your elbow. The yellowbrick\u00a0<a href=\"https:\/\/www.scikit-yb.org\/en\/latest\/quickstart.html\" target=\"_blank\" rel=\"noopener noreferrer\">library<\/a>\u00a0(a great tool for visualising scikit-learn models, which can be pip installed) has a really nice plot for this. 
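If you would rather not add a dependency, the data behind the elbow plot can also be computed with scikit-learn alone, using each fitted KMeans model's inertia_ attribute. A minimal sketch on synthetic blobs (an assumption standing in for the scaled wine features):

```python
# Minimal sketch: the elbow data without yellowbrick. inertia_ is the sum of
# squared distances from each point to its nearest centroid; it falls sharply
# until k reaches the true cluster count, then flattens.
# Synthetic blobs stand in for the scaled wine features (illustration only).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

inertias = {k: KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_
            for k in range(1, 8)}

# The elbow sits where the improvement from adding another cluster collapses.
drops = {k: inertias[k - 1] - inertias[k] for k in range(2, 8)}
```

Plotting `inertias` against k by hand gives the same curve that yellowbrick draws for you.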
The code below produces this visualisation.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7da7af0 elementor-widget elementor-widget-text-editor\" data-id=\"7da7af0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t&nbsp;\n<div style=\"background: #eee; border: 1px solid #ccc; padding: 5px 10px;\"><span style=\"font-family: courier new,courier,monospace;\">model = KMeans()\nvisualizer = KElbowVisualizer(model, k=(1,8))\nvisualizer.fit(X_scaled)\nvisualizer.show()<\/span><\/div>\n<p style=\"text-align: center;\"><\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e660f05 elementor-widget elementor-widget-image\" data-id=\"e660f05\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/miro.medium.com\/max\/1024\/1*AbDwDKrPaY0tOIiiNJe_tA.png\" alt=\"\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a468cb2 elementor-widget elementor-widget-text-editor\" data-id=\"a468cb2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tOrdinarily, we wouldn\u2019t already know how many categories we have in a dataset where we are using a clustering technique. 
However, in this case, we know that there are three wine types in the data \u2014 the curve has correctly selected three as the optimum number of clusters to use in the model.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2a9e0e9 elementor-widget elementor-widget-text-editor\" data-id=\"2a9e0e9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tThe next step is to initialise the K-means algorithm, fit the model to the training data and evaluate how effectively the algorithm has clustered the data.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d9e7d0a elementor-widget elementor-widget-text-editor\" data-id=\"d9e7d0a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tOne method used for this is known as the\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Silhouette_(clustering)\" target=\"_blank\" rel=\"noopener noreferrer\">silhouette score<\/a>. This measures the consistency of values within the clusters; in other words, how similar to each other the values in each cluster are, and how much separation there is between the clusters. The silhouette score is calculated for each value and ranges from -1 to +1. These values are then plotted to form a silhouette plot. Again, yellowbrick provides a simple way to construct this type of plot. 
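The mean silhouette score itself can also be computed directly with scikit-learn's silhouette_score, without any plotting. A minimal sketch comparing a good and a poor choice of k, again on synthetic blobs (an assumption standing in for the scaled wine data):

```python
# Minimal sketch: mean silhouette score via scikit-learn alone.
# Synthetic blobs stand in for the scaled wine data (illustration only).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

# Scores range from -1 to +1; higher means tighter, better-separated clusters.
score_k3 = silhouette_score(X, KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X))
score_k8 = silhouette_score(X, KMeans(n_clusters=8, n_init=10, random_state=42).fit_predict(X))
```

With three true clusters, the k=3 fit scores noticeably higher than the k=8 fit, which needlessly subdivides the real clusters.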
The code below creates this visualisation for the wine dataset.\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-bee27a9 elementor-widget elementor-widget-text-editor\" data-id=\"bee27a9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t&nbsp;\n<div style=\"background: #eee; border: 1px solid #ccc; padding: 5px 10px;\"><span style=\"font-family: courier new,courier,monospace;\">model = KMeans(3, random_state=42)\nvisualizer = SilhouetteVisualizer(model, colors='yellowbrick')\nvisualizer.fit(X_scaled)\nvisualizer.show()<\/span><\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ad6abf9 elementor-widget elementor-widget-image\" data-id=\"ad6abf9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/miro.medium.com\/max\/970\/1*XwJVtULuoyAlrHs0nyg1xA.png\" alt=\"\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7b41e8c elementor-widget elementor-widget-text-editor\" data-id=\"7b41e8c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tA silhouette plot can be interpreted in the following way:\n<ul>\n \t<li>The closer the mean score (which is the red dotted line in the above) is to +1, the better matched the data points are within the cluster.<\/li>\n \t<li>Data points with a score of 0 are very close to the decision boundary for another cluster (so the separation is low).<\/li>\n \t<li>Negative values indicate that the data points may have been assigned to the wrong cluster.<\/li>\n \t<li>The width of 
each cluster should be reasonably uniform; if it isn\u2019t, the wrong value of k may have been used.<\/li>\n<\/ul>\nThe plot for the wine dataset above shows that cluster 0 may not be as consistent as the others: most of its data points fall below the average score, and a few have a score below 0.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1f6c78c elementor-widget elementor-widget-text-editor\" data-id=\"1f6c78c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tSilhouette scores can be particularly useful for comparing one algorithm against another, or different values of k.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8e67581 elementor-widget elementor-widget-text-editor\" data-id=\"8e67581\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tIn this post, I wanted to give a brief introduction to each of the three types of machine learning. 
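The comparison of different values of k mentioned above can also be done numerically with scikit-learn\u2019s silhouette_score, without plotting. This is a minimal sketch, assuming the scaled wine features from the earlier walkthrough (the variable names mirror that code but are reconstructed here):

```python
# Sketch: compare the mean silhouette score for several values of k
# on the standardised wine dataset, mirroring the earlier walkthrough.
from sklearn.cluster import KMeans
from sklearn.datasets import load_wine
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

X, _ = load_wine(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

for k in range(2, 6):
    # Fit K-means for this k and score the resulting cluster labels.
    labels = KMeans(n_clusters=k, random_state=42, n_init=10).fit_predict(X_scaled)
    print(f"k={k}: mean silhouette = {silhouette_score(X_scaled, labels):.3f}")
```

A higher mean silhouette suggests denser, better-separated clusters, so looping over candidate values of k like this gives a quick sanity check alongside the elbow curve.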
There are many other steps involved in all of these processes, including feature engineering, data preprocessing and hyperparameter optimisation, to determine both the best preprocessing techniques and the best models to use.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Machine learning problems can generally be divided into three types.\u00a0Classification and regression, which are known as supervised learning, and unsupervised learning which in the context of machine learning applications often refers to clustering.In the following article, I am going to give a brief introduction to each of these three problems and will include a walkthrough<\/p>\n","protected":false},"author":795,"featured_media":3180,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[92],"ppma_author":[2924],"class_list":["post-2162","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-machine-learning"],"authors":[{"term_id":2924,"user_id":795,"is_guest":0,"slug":"rebecca-vickery","display_name":"Rebecca Vickery","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Vickery","first_name":"Rebecca","job_title":"","description":"Rebecca Vickery is a Data Scientist at Holiday Extras. She has been working in data &amp; analytics in the Travel industry for the past 10 years. 
Areas of interest include machine learning, customer lifecycle analytics, python and sql development."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/2162","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/795"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=2162"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/2162\/revisions"}],"predecessor-version":[{"id":35816,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/2162\/revisions\/35816"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/3180"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=2162"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=2162"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=2162"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=2162"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}