{"id":22579,"date":"2021-01-22T10:56:27","date_gmt":"2021-01-22T10:56:27","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/ensemble-learning-bagging-boosting\/"},"modified":"2023-09-05T13:42:37","modified_gmt":"2023-09-05T13:42:37","slug":"ensemble-learning-bagging-boosting","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/ensemble-learning-bagging-boosting\/","title":{"rendered":"Ensemble Learning: Bagging &#038; Boosting"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"22579\" class=\"elementor elementor-22579\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-b785720 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"b785720\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-7599728\" data-id=\"7599728\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-26d66fc elementor-widget elementor-widget-text-editor\" data-id=\"26d66fc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p class=\"has-medium-font-size\"><em>How to combine weak learners to build a stronger learner to reduce bias and variance in your ML model<\/em><\/p>\n\n<p id=\"953d\">The bias and variance tradeoff is one of the key concerns when working with machine learning algorithms. 
Fortunately, there are some <strong>Ensemble Learning<\/strong>-based techniques that machine learning practitioners can take advantage of to tackle the bias and variance tradeoff: <strong>bagging<\/strong> and <strong>boosting<\/strong>. In this blog we explain how <strong>bagging<\/strong> and <strong>boosting<\/strong> work, what their components are, and how you can implement them in your ML problem. The blog is divided into the following sections:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-18b6a6c elementor-widget elementor-widget-text-editor\" data-id=\"18b6a6c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul><li><strong>What is Bagging?<\/strong><\/li><li><strong>What is Boosting?<\/strong><\/li><li><strong>AdaBoost<\/strong><\/li><li><strong>Gradient Boosting<\/strong><\/li><\/ul>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-b9d8b73 elementor-widget elementor-widget-heading\" data-id=\"b9d8b73\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">What is Bagging?<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c5ab07b elementor-widget elementor-widget-text-editor\" data-id=\"c5ab07b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"ac4c\"><strong>Bagging<\/strong> or <strong>Bootstrap Aggregation<\/strong> was formally introduced by Leo Breiman in 1996 [<a href=\"https:\/\/link.springer.com\/article\/10.1023\/A:1018054314350\" 
target=\"_blank\" rel=\"noreferrer noopener\">3<\/a>]. <strong>Bagging<\/strong> is an <strong>Ensemble Learning<\/strong> technique which aims to reduce learning error through the implementation of a set of homogeneous machine learning algorithms. The key idea of <strong>bagging<\/strong> is to use multiple base learners, each trained separately on a random sample from the training set, whose outputs are combined through a voting or averaging approach to produce a more stable and accurate model.<\/p>\n\n<p id=\"d4eb\">The two main components of the <strong>bagging<\/strong> technique are: <em>random sampling with replacement <\/em>(<strong>bootstrapping<\/strong>) and the <em>set of homogeneous <\/em>machine learning algorithms (<strong>ensemble learning<\/strong>). The <strong>bagging<\/strong> process is quite easy to understand: first, \u201c<em>n<\/em>\u201d subsets are extracted from the training set; then these subsets are used to train \u201c<em>n<\/em>\u201d base learners of the same type. To make a prediction, each of the \u201c<em>n<\/em>\u201d learners is fed with the test sample, and the output of each learner is averaged (in the case of regression) or voted (in the case of classification). 
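The bootstrap-and-aggregate process just described can be sketched in a few lines of plain NumPy; the subset count and size below are illustrative assumptions, not values from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_subsets(X, y, n_subsets, subset_size):
    """Draw n_subsets random samples *with replacement* from (X, y)."""
    subsets = []
    for _ in range(n_subsets):
        # indices may repeat: this is sampling with replacement (bootstrapping)
        idx = rng.integers(0, len(X), size=subset_size)
        subsets.append((X[idx], y[idx]))
    return subsets

# Toy data: 10 samples, 2 features
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

subsets = bootstrap_subsets(X, y, n_subsets=5, subset_size=4)
print(len(subsets), subsets[0][0].shape)  # 5 (4, 2)
```

Each of the 5 subsets would then train its own base learner, and their predictions would be voted or averaged.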
Figure 2 shows an overview of the <strong>bagging<\/strong> architecture.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5541814 elementor-widget elementor-widget-image\" data-id=\"5541814\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1U4-AMDI9S57OyjmJlIsUWw-1024x576.jpeg\" class=\"attachment-large size-large wp-image-18504\" alt=\"Bagging architecutre - Ensemble Learning: Bagging &amp; Boosting\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1U4-AMDI9S57OyjmJlIsUWw-1024x576.jpeg 1024w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1U4-AMDI9S57OyjmJlIsUWw-300x169.jpeg 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1U4-AMDI9S57OyjmJlIsUWw-768x432.jpeg 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1U4-AMDI9S57OyjmJlIsUWw-1536x864.jpeg 1536w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1U4-AMDI9S57OyjmJlIsUWw-610x343.jpeg 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1U4-AMDI9S57OyjmJlIsUWw-750x422.jpeg 750w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1U4-AMDI9S57OyjmJlIsUWw-1140x641.jpeg 1140w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1U4-AMDI9S57OyjmJlIsUWw.jpeg 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1d2562d elementor-widget elementor-widget-text-editor\" data-id=\"1d2562d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div 
class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"12eb\">It is important to notice that the number of <em>subsets<\/em>, as well as the number of items per <em>subset<\/em>, will be determined by the nature of your ML problem, as will the type of ML algorithm to be used. In addition, Leo Breiman mentions in his paper that he noticed that classification problems require more <em>subsets<\/em> than regression problems.<\/p>\n\n<p id=\"18c0\">For implementing <strong>bagging<\/strong>, scikit-learn provides a function to do it easily. For a basic execution we only need to provide some parameters such as the <em>base learner<\/em>, the <em>number of estimators<\/em> and the <em>maximum number of samples<\/em> per subset.<\/p>\n\n<div class='gist '><br \/><\/div>\n\n<p id=\"be08\">The previous code snippet creates a <em>bagging-based model<\/em> for the well-known <em>breast cancer dataset. <\/em>A Decision Tree was used as the base learner, and 5 subsets were created randomly with replacement from the training set (to train 5 decision tree models). The number of items per subset was 50. 
By running it we will get:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8d3d3a5 elementor-widget elementor-widget-text-editor\" data-id=\"8d3d3a5\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre class=\"wp-block-preformatted\">Train score: 0.9583568075117371<br>Test score: 0.941048951048951<\/pre>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5d98e68 elementor-widget elementor-widget-text-editor\" data-id=\"5d98e68\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"6ce2\">One of the key advantages of <strong>bagging<\/strong> is that it can be executed in parallel since there is no dependency between estimators. For <em>small datasets<\/em>, a few estimators will be enough (as in the example above); <em>larger datasets<\/em> may require more estimators.<\/p>\n\n<p id=\"6927\">Great, so far we\u2019ve already seen what <strong>bagging <\/strong>is and how it works. 
Let\u2019s see what <strong>boosting <\/strong>is, its components and why it is related to <strong>bagging<\/strong>, let\u2019s go for it!<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2d3540e elementor-widget elementor-widget-heading\" data-id=\"2d3540e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">What is Boosting?<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ad5e73e elementor-widget elementor-widget-text-editor\" data-id=\"ad5e73e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"b6d2\"><strong>Boosting<\/strong>\u00a0is an\u00a0<strong>Ensemble Learning<\/strong>\u00a0technique that, like\u00a0<strong>bagging<\/strong>, makes use of a set of\u00a0<em>base learners<\/em>\u00a0to improve the stability and effectiveness of an <a href=\"https:\/\/www.experfy.com\/blog\/ai-ml\/comparing-different-classification-machine-learning-models-for-an-imbalanced-dataset\/\" target=\"_blank\" rel=\"noreferrer noopener\">ML model<\/a>. The idea behind a\u00a0<strong>boosting<\/strong>\u00a0architecture is the generation of sequential hypotheses, where each hypothesis tries to improve or correct the mistakes made in the previous one [<a href=\"https:\/\/www.cs.princeton.edu\/courses\/archive\/spr07\/cos424\/papers\/boosting-survey.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">4<\/a>]. 
The central idea of\u00a0<strong>boosting<\/strong>\u00a0is the implementation of\u00a0<em>homogeneous ML algorithms<\/em>\u00a0in a\u00a0<strong>sequential way<\/strong>, where each of these ML algorithms tries to improve the stability of the model by focusing on the errors made by the previous ML algorithm. The way in which the errors of each\u00a0<em>base learner<\/em>\u00a0are addressed by the next\u00a0<em>base learner<\/em>\u00a0in the sequence is the key differentiator between all variations of the\u00a0<strong>boosting<\/strong>\u00a0technique.<\/p>\n\n<p id=\"17a6\">The <strong>boosting<\/strong> technique has been studied and improved over the years, and several variations have been added to its core idea. Some of the most popular are: <strong>AdaBoost<\/strong> (Adaptive Boosting), <strong>Gradient Boosting<\/strong> and <strong>XGBoost<\/strong> (Extreme Gradient Boosting). As mentioned above, the key differentiator between <em>boosting-based techniques<\/em> is the way in which errors are penalized (by modifying <strong><em>weights<\/em><\/strong> or minimizing a <strong>loss function<\/strong>) as well as how the data is sampled.<\/p>\n\n<p id=\"e4a3\">For a better understanding of the differences between some of the <strong>boosting<\/strong> techniques, let\u2019s see in a general way how <strong>AdaBoost<\/strong> and <strong>Gradient Boosting<\/strong>, two of the most common variations of the boosting technique, work. Let\u2019s go for it!<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5bbe3db elementor-widget elementor-widget-heading\" data-id=\"5bbe3db\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">AdaBoost<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element 
elementor-element-d4a2ea2 elementor-widget elementor-widget-text-editor\" data-id=\"d4a2ea2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"f101\"><strong>AdaBoost<\/strong> is an algorithm based on the <strong>boosting<\/strong> technique; it was introduced in 1995 by Freund and Schapire [<a href=\"https:\/\/www.face-rec.org\/algorithms\/Boosting-Ensemble\/decision-theoretic_generalization.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">5<\/a>]. <strong>AdaBoost<\/strong> implements a <em>vector of weights <\/em>to penalize those samples that were incorrectly inferred (by increasing the weight) and reward those that were correctly inferred (by decreasing the weight). Updating this <em>weight vector <\/em>will generate a distribution where it will be more likely to extract those samples with higher weight (that is, those that were incorrectly inferred); this sample will be fed to the next <em>base learner<\/em> in the sequence. This will be repeated until a stop criterion is met. Likewise, each base learner in the sequence will be assigned a weight: the higher its performance, the higher the weight and the greater the impact of this base learner on the final decision. Finally, to make a prediction, each base learner in the sequence will be fed with the test data, and the predictions of the models will be voted (for the classification case) or averaged (for the regression case). 
In Figure 3 we observe the descriptive architecture of the <strong>AdaBoost<\/strong> operation.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3530a8d elementor-widget elementor-widget-image\" data-id=\"3530a8d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1te-E-A4BRgCirwHXHMXTaw-1024x576.jpeg\" class=\"attachment-large size-large wp-image-18505\" alt=\"Ensemble Learning: Bagging &amp; Boosting\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1te-E-A4BRgCirwHXHMXTaw-1024x576.jpeg 1024w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1te-E-A4BRgCirwHXHMXTaw-300x169.jpeg 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1te-E-A4BRgCirwHXHMXTaw-768x432.jpeg 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1te-E-A4BRgCirwHXHMXTaw-1536x864.jpeg 1536w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1te-E-A4BRgCirwHXHMXTaw-610x343.jpeg 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1te-E-A4BRgCirwHXHMXTaw-750x422.jpeg 750w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1te-E-A4BRgCirwHXHMXTaw-1140x641.jpeg 1140w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1te-E-A4BRgCirwHXHMXTaw.jpeg 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Figure 3. 
AdaBoost: a descriptive architecture | Image by Author<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f33ad5c elementor-widget elementor-widget-text-editor\" data-id=\"f33ad5c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"a517\"><em>Scikit-learn<\/em> provides the function to implement the <strong>AdaBoost<\/strong> technique, let\u2019s see how to perform a basic implementation:<\/p>\n<p id=\"44e8\">As we can see, the <em>base learner<\/em> that we are using is a <em>decision tree<\/em> (it is suggested that it be a decision tree, however, you can try some other ML algorithm), we are also defining only 5 estimators for the base learner sequence (this is enough for the toy dataset that we are trying to predict), running this we would obtain the following results:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-00328df elementor-widget elementor-widget-text-editor\" data-id=\"00328df\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre class=\"wp-block-preformatted\">Train score: 0.9694835680751174<br>Test score: 0.958041958041958<\/pre>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-06a83fb elementor-widget elementor-widget-text-editor\" data-id=\"06a83fb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"5e2f\">Great, we already saw in a general way how <strong>AdaBoost <\/strong>works, now let\u2019s see what about <strong>Gradient Boosting<\/strong> and how we can implement 
it.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-bcbb755 elementor-widget elementor-widget-heading\" data-id=\"bcbb755\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Gradient Boosting<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5b4171b elementor-widget elementor-widget-text-editor\" data-id=\"5b4171b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"8c34\">The <strong>Gradient Boosting<\/strong> method does not implement a <em>vector of weights<\/em> like <strong>AdaBoost<\/strong> does. As its name implies, it implements the calculation of the <em>gradient<\/em> for the optimization of a given loss function. The core idea of <strong>Gradient Boosting<\/strong> is based on minimizing the residuals of each base learner in a sequential way; this minimization is carried out through the calculation of the gradient applied to a specific loss function (either for classification or regression). Then each <em>base learner<\/em> added to the sequence will minimize the residuals determined by the previous <em>base learner<\/em>. This will be repeated until the error function is as close to zero as possible or until a specified number of <em>base learners<\/em> has been added. 
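This residual-fitting loop corresponds to scikit-learn's GradientBoostingClassifier, which the article implements with depth-2 trees, 5 estimators, and a 0.1 learning rate. The dataset, split, and `random_state` below are assumptions matching the earlier examples:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Each depth-2 tree fits the residual errors left by the trees before it,
# scaled by the learning rate
gb = GradientBoostingClassifier(
    max_depth=2,
    n_estimators=5,
    learning_rate=0.1,
    random_state=42,
)
gb.fit(X_train, y_train)

print(f"Train score: {gb.score(X_train, y_train)}")
print(f"Test score: {gb.score(X_test, y_test)}")
```

No base learner is passed in because Gradient Boosting in scikit-learn always uses regression trees internally, even for classification.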
Finally, to make a prediction, each of the <em>base learners<\/em> are fed with the test data whose outputs are parameterized and subsequently added to generate the final prediction.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f07d70a elementor-widget elementor-widget-image\" data-id=\"f07d70a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1GwgguowQ1UzL0HB24KezBw-1024x576.jpeg\" class=\"attachment-large size-large wp-image-18506\" alt=\"Ensemble Learning: Bagging &amp; Boosting\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1GwgguowQ1UzL0HB24KezBw-1024x576.jpeg 1024w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1GwgguowQ1UzL0HB24KezBw-300x169.jpeg 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1GwgguowQ1UzL0HB24KezBw-768x432.jpeg 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1GwgguowQ1UzL0HB24KezBw-1536x864.jpeg 1536w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1GwgguowQ1UzL0HB24KezBw-610x343.jpeg 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1GwgguowQ1UzL0HB24KezBw-750x422.jpeg 750w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1GwgguowQ1UzL0HB24KezBw-1140x641.jpeg 1140w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1GwgguowQ1UzL0HB24KezBw.jpeg 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Figure 4. 
Gradient Boosting: a descriptive architecture | Image by author<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a311370 elementor-widget elementor-widget-text-editor\" data-id=\"a311370\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"8bf9\">Just like <strong>Bagging<\/strong> and <strong>AdaBoost<\/strong>, <em>scikit-learn<\/em> provides the function to implement <strong>Gradient Boosting<\/strong>, let\u2019s see how to make a basic implementation:<\/p>\n\n<div class='gist '><br \/><\/div>\n\n<p id=\"587f\"><strong>Gradient Boosting<\/strong> works with <em>decision trees<\/em> by default, that is why in the implementation we do not define a specific base learner. We are defining that each tree in the sequence will have a maximum depth of 2, the number of trees will be 5 and the learning rate for each tree will be 0.1, running this we obtain:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7e3fdf6 elementor-widget elementor-widget-text-editor\" data-id=\"7e3fdf6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre class=\"wp-block-preformatted\">Train score: 0.9906103286384976<br>Test score: 0.965034965034965<\/pre>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-18ac255 elementor-widget elementor-widget-text-editor\" data-id=\"18ac255\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"603d\">Fantastic, with this we finish this exploration on <strong>bagging<\/strong>, <strong>boosting<\/strong> 
and some <strong>boosting<\/strong> implementations. That\u2019s it!<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-653215d elementor-widget elementor-widget-heading\" data-id=\"653215d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Conclusion<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a836783 elementor-widget elementor-widget-text-editor\" data-id=\"a836783\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"34ab\">In this blog we have seen two of the most widely implemented <strong>Ensemble Learning<\/strong> techniques.<\/p>\n\n<p id=\"98fa\">As we have seen, <strong>bagging<\/strong> is a technique that draws random samples with replacement to train \u201cn\u201d <em>base learners<\/em>, which allows the model to be processed in parallel. It is this random sampling that makes <strong>bagging<\/strong> a technique that mostly helps to reduce variance. 
On the other hand, <strong>boosting<\/strong> is a sequentially constructed technique where each model in the sequence focuses on the errors of the previous <em>base learner.<\/em> Although <strong>boosting<\/strong> is a technique that mainly helps to reduce bias, it is highly prone to over-fitting the model.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-b66e9f8 elementor-widget elementor-widget-heading\" data-id=\"b66e9f8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">References<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d2593fc elementor-widget elementor-widget-text-editor\" data-id=\"d2593fc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"cdc7\">[1] <a href=\"https:\/\/quantdare.com\/what-is-the-difference-between-bagging-and-boosting\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/quantdare.com\/what-is-the-difference-between-bagging-and-boosting\/<\/a><\/p>\n\n<p id=\"9047\">[2] <a href=\"https:\/\/arxiv.org\/pdf\/0804.2752.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/arxiv.org\/pdf\/0804.2752.pdf<\/a><\/p>\n\n<p id=\"14a0\">[3] <a href=\"https:\/\/link.springer.com\/article\/10.1023\/A:1018054314350\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/link.springer.com\/article\/10.1023\/A:1018054314350<\/a><\/p>\n\n<p id=\"6ed0\">[4] <a href=\"https:\/\/www.cs.princeton.edu\/courses\/archive\/spr07\/cos424\/papers\/boosting-survey.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.cs.princeton.edu\/courses\/archive\/spr07\/cos424\/papers\/boosting-survey.pdf<\/a><\/p>\n\n<p id=\"fd8d\">[5] <a 
href=\"https:\/\/www.face-rec.org\/algorithms\/Boosting-Ensemble\/decision-theoretic_generalization.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.face-rec.org\/algorithms\/Boosting-Ensemble\/decision-theoretic_generalization.pdf<\/a><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Learn how to combine weak learners to build a stronger learner to reduce bias and variance in your ML model<\/p>\n","protected":false},"author":1030,"featured_media":18507,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[1275,1276,1157,92],"ppma_author":[3749],"class_list":["post-22579","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-bagging","tag-boosting","tag-ensemble-learning","tag-machine-learning"],"authors":[{"term_id":3749,"user_id":1030,"is_guest":0,"slug":"velasco","display_name":"Fernando Lopez Velasco","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/Fernando-Lopez-Velasco-150x150.jpeg","user_url":"https:\/\/koneksys.com","last_name":"Lopez Velasco","first_name":"Fernando","job_title":"","description":"Fernando L\u00f3pez Velasco is Deep Learning Engineer at Koneksys, provider of software consulting and research 
services."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22579","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/1030"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=22579"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22579\/revisions"}],"predecessor-version":[{"id":32409,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22579\/revisions\/32409"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/18507"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=22579"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=22579"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=22579"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=22579"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}