{"id":773,"date":"2018-07-03T05:11:35","date_gmt":"2018-07-03T02:11:35","guid":{"rendered":"http:\/\/kusuaks7\/?p=378"},"modified":"2025-11-21T09:11:39","modified_gmt":"2025-11-21T09:11:39","slug":"technical-debt-in-machine-learning","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/technical-debt-in-machine-learning\/","title":{"rendered":"Technical Debt in Machine Learning"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"773\" class=\"elementor elementor-773\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-5f9e5cfa elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"5f9e5cfa\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-5af16340\" data-id=\"5af16340\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-571f13a5 elementor-widget elementor-widget-text-editor\" data-id=\"571f13a5\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<strong><em>Ready to learn Machine Learning? 
<a href=\"https:\/\/www.experfy.com\/training\/courses\">Browse courses<\/a>\u00a0like\u00a0<a href=\"https:\/\/www.experfy.com\/training\/courses\/machine-learning-foundations-supervised-learning\">Machine Learning Foundations: Supervised Learning<\/a> developed by industry thought leaders and Experfy in Harvard Innovation Lab.<\/em><\/strong>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-ef7d356 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"ef7d356\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-5165d22\" data-id=\"5165d22\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-55bb189 elementor-widget elementor-widget-text-editor\" data-id=\"55bb189\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"dc53\">Many of us frown upon technical debt, but in general it is not a bad thing. Technical debt is a tool that is justified when we need to meet a release deadline or unblock a colleague. The problem with technical debt, though, is the same as with financial debt: when the time comes to repay it, we give back more than we took at the beginning. 
That is because technical debt compounds.<\/p>\n<p id=\"e9ab\"><strong>Experienced teams know when to back off when they see debt piling up, but technical debt in machine learning piles up extremely fast.<\/strong>\u00a0<strong>You can create months' worth of debt in a single working day, and even the most experienced teams can miss the moment when the debt becomes so large that it sets them back half a year, which is often enough to kill a fast-paced project.<\/strong><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-c10f8ff elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"c10f8ff\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-898b648\" data-id=\"898b648\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-2062308 elementor-widget elementor-widget-text-editor\" data-id=\"2062308\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"0009\">Here are three fantastic papers that explore this issue:\n<a href=\"https:\/\/research.google.com\/pubs\/pub43146.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/research.google.com\/pubs\/pub43146.html\" data->Machine Learning: The High Interest Credit Card of Technical Debt<\/a>\u00a0NIPS\u201914\n<a 
href=\"https:\/\/papers.nips.cc\/paper\/5656-hidden-technical-debt-in-machine-learning-systems.pdf\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/papers.nips.cc\/paper\/5656-hidden-technical-debt-in-machine-learning-systems.pdf\" data->Hidden Technical Debt in Machine Learning Systems<\/a>\u00a0NIPS\u201915\n<a href=\"https:\/\/research.google.com\/pubs\/pub45742.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/research.google.com\/pubs\/pub45742.html\" data->What\u2019s your ML test score?<\/a>\u00a0NIPS\u201916<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-df90b69 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"df90b69\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-fc86a89\" data-id=\"fc86a89\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b0cbad7 elementor-widget elementor-widget-text-editor\" data-id=\"b0cbad7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"a15c\">These papers categorize and present dozens of machine learning anti-patterns that can slowly creep into your infrastructure creating a time bomb. 
Here I discuss only three anti-patterns that wake me up at night in a cold sweat and I will leave the rest to the reader.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-e55d7ff elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"e55d7ff\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8378d7a\" data-id=\"8378d7a\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-65ada2b elementor-widget elementor-widget-heading\" data-id=\"65ada2b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\"><h3 id=\"65cb\"><strong>Feedback Loops<\/strong><\/h3><\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-18aa5a8 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"18aa5a8\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-dc5d3e3\" data-id=\"dc5d3e3\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div 
class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b647ab9 elementor-widget elementor-widget-text-editor\" data-id=\"b647ab9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"0d9a\">Feedback loops happen when the output of an ML model is indirectly fed back into its own input. That sounds easy to avoid, but in practice it is not. There are multiple variations of feedback loops, and the\u00a0<a href=\"https:\/\/research.google.com\/pubs\/pub43146.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/research.google.com\/pubs\/pub43146.html\" data->NIPS\u201914 paper<\/a>\u00a0gives a great example, but I will give one that is closer to real life.<\/p>\n<p id=\"32a2\"><strong>Example<\/strong>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-986f267 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"986f267\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-27ab535\" data-id=\"27ab535\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-77b8260 elementor-widget elementor-widget-text-editor\" data-id=\"77b8260\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div 
class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tLet us say your company has a shopping website. A backend team builds a recommender system that decides whether to show a pop-up notification with an offer, based on the customer\u2019s profile and purchase history. Naturally, you want to train the recommender on previously clicked or ignored pop-up notifications, which is not a feedback loop yet. You launch the feature and rejoice as the fraction of clicked notifications slowly grows week over week. You attribute this growth to the AI\u2019s ability to improve on its past performance\u00a0\ud83d\ude42 What you did not know, though, is that the front-end team implemented a fixed threshold that hides the pop-up notification if the confidence of a recommended offer is below 50%, because, obviously, they do not want to show potentially bad offers to customers. As time passes, recommendations that would previously have fallen in the 50\u201360% confidence range are now inferred with &lt;50% confidence, leaving only the most potent recommendations in the 50\u2013100% bracket. That is a feedback loop: your metric grows, but the quality of the system does not improve. 
Moral: you should not only\u00a0<strong><em>exploit<\/em><\/strong>\u00a0the ML system but also allow it to\u00a0<strong><em>explore<\/em><\/strong>, so get rid of the fixed threshold.<\/p>\n<p id=\"507f\">In small companies, it is relatively easy to control feedback loops, but in large companies with dozens of teams working on dozens of complex systems piped into each other, some feedback loops are very likely to be missed.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-0d3c268 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"0d3c268\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-53adc67\" data-id=\"53adc67\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-732c57d elementor-widget elementor-widget-image\" data-id=\"732c57d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"800\" height=\"450\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_8-BjqR45cFwD1SlwcBAi0g.jpg\" class=\"attachment-large size-large wp-image-38140\" alt=\"\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_8-BjqR45cFwD1SlwcBAi0g.jpg 800w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_8-BjqR45cFwD1SlwcBAi0g-300x169.jpg 300w, 
https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_8-BjqR45cFwD1SlwcBAi0g-768x432.jpg 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_8-BjqR45cFwD1SlwcBAi0g-610x343.jpg 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_8-BjqR45cFwD1SlwcBAi0g-750x422.jpg 750w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-3f1ae16 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"3f1ae16\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-fac616d\" data-id=\"fac616d\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-03ed383 elementor-widget elementor-widget-text-editor\" data-id=\"03ed383\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"82e5\">You can smell a feedback loop when some of your metrics slowly drift upward over time even though there have been no launches. 
Finding and fixing the loop is a much harder problem since it involves a directed cross-team effort.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-9b9afd6 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"9b9afd6\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-9bf58d5\" data-id=\"9bf58d5\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b221a4d elementor-widget elementor-widget-image\" data-id=\"b221a4d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"800\" height=\"303\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_jEDiO-0QhitphWWNKv87Kg.jpg\" class=\"attachment-large size-large wp-image-38141\" alt=\"\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_jEDiO-0QhitphWWNKv87Kg.jpg 800w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_jEDiO-0QhitphWWNKv87Kg-300x114.jpg 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_jEDiO-0QhitphWWNKv87Kg-768x291.jpg 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_jEDiO-0QhitphWWNKv87Kg-610x231.jpg 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_jEDiO-0QhitphWWNKv87Kg-750x284.jpg 750w\" sizes=\"(max-width: 800px) 100vw, 800px\" 
\/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-95ea8d6 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"95ea8d6\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-ada1dad\" data-id=\"ada1dad\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b92758a elementor-widget elementor-widget-heading\" data-id=\"b92758a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\"><h3 id=\"d39d\"><strong>Correction Cascades<\/strong><\/h3>\n<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-31ee600 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"31ee600\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-245a159\" data-id=\"245a159\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element 
elementor-element-1cbb217 elementor-widget elementor-widget-text-editor\" data-id=\"1cbb217\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"be9d\">Correction cascades happen when the ML model does not learn what you want it to learn and you end up applying a hotfix to its output. As the hotfixes pile up, you end up with a thick layer of heuristics on top of the ML model, which is called a correction cascade. Correction cascades are extremely tempting even in the absence of time pressure. It is easy to apply a filter to the output of the ML system to take care of some rare special cases that the model fails to learn.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-a8d01fb elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a8d01fb\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-1f1af2a\" data-id=\"1f1af2a\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-917647a elementor-widget elementor-widget-image\" data-id=\"917647a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"800\" height=\"430\" 
src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_D33Ygo7pmMJoL5bUSdWqSQ.jpg\" class=\"attachment-large size-large wp-image-38142\" alt=\"\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_D33Ygo7pmMJoL5bUSdWqSQ.jpg 800w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_D33Ygo7pmMJoL5bUSdWqSQ-300x161.jpg 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_D33Ygo7pmMJoL5bUSdWqSQ-768x413.jpg 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_D33Ygo7pmMJoL5bUSdWqSQ-610x328.jpg 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2018\/07\/1_D33Ygo7pmMJoL5bUSdWqSQ-750x403.jpg 750w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-8cce93b elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"8cce93b\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-c8ddb7e\" data-id=\"c8ddb7e\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b3b5c27 elementor-widget elementor-widget-text-editor\" data-id=\"b3b5c27\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"f7b7\">Correction cascades decorrelate the metrics that your ML model tries to optimize while training from the overall metrics of the entire system. 
As this layer grows thicker you can no longer figure out what changes to the ML model would improve the final metrics that you present to your boss and you end up not being able to deliver the new improvements.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-08af128 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"08af128\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-bf0675b\" data-id=\"bf0675b\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-2c832d7 elementor-widget elementor-widget-heading\" data-id=\"2c832d7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\"><h3 id=\"419d\"><strong>Hobo-features<\/strong><\/h3>\n<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1a8b6ff elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1a8b6ff\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element 
elementor-element-ce88710\" data-id=\"ce88710\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-8c99407 elementor-widget elementor-widget-text-editor\" data-id=\"8c99407\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"a95d\">Hobo-features are features that do nothing useful in your ML system, yet you cannot get rid of them. There are three types of hobo-features:<\/p>\n<p id=\"2785\"><strong>Bundled features<\/strong>\nSometimes, when we have a group of new features, we evaluate them together and, if the group proves beneficial, submit the entire bundle. Unfortunately, only some of the features in the bundle may be useful while the others drag it down.<\/p>\n<p id=\"355b\"><strong>\u03b5-Features<\/strong>\nIt is sometimes tempting to add a feature even when the quality gain is very minor. Such features, however, may become neutral or negative within a week if the underlying data drifts a bit.<\/p>\n<p id=\"3bbf\"><strong>Legacy features<\/strong>\nAs time goes on, we add new features to a project and never reevaluate them. Within a couple of months, some of these features may become totally useless or be superseded by newer features.<\/p>\n<p id=\"197d\">In a complex ML system, the only way to efficiently weed out hobo-features is to try pruning them one at a time: you remove one feature, retrain the ML system, and evaluate it using your metrics. If the system takes 1 day to train, we can run at most 5 trainings at a time, and we have 500 features, then pruning all of them will take us 100 days. 
Unfortunately, features may interact, which means you would have to try pruning all possible subsets of features, and that is an exponentially hard problem.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-371581f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"371581f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-660ece2\" data-id=\"660ece2\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-19ff1bf elementor-widget elementor-widget-heading\" data-id=\"19ff1bf\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\"><h3 id=\"f93b\">With Our Powers\u00a0Combined<\/h3>\n<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1362370 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1362370\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-0aa94aa\" data-id=\"0aa94aa\" data-element_type=\"column\" 
data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-1bbaff4 elementor-widget elementor-widget-text-editor\" data-id=\"1bbaff4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"0af2\">Having all three anti-patterns in your machine learning infrastructure can be an instakill for the entire project.<\/p>\n<p id=\"0b7f\">With feedback loops, your metrics will not reflect the real quality of the system, and your ML model will learn to exploit the loops instead of learning useful things. Additionally, as time goes on, your model may be unintentionally shaped by the engineering team to exploit these loops even more.<\/p>\n<p id=\"743d\">Correction cascades will loosen the correlation between the metrics measured directly on the ML model and those of the system as a whole. You will end up in a situation where improvements to the ML model have a random effect on the metrics of the overall system.<\/p>\n<p id=\"6677\">With hobo-features, you will not even know which of your hundreds of features actually carry useful information, and it will be too expensive to prune them. From day to day, the metrics that you commonly monitor will randomly jump up or drop down because some of the garbage features will randomly act up. And no, regularization helps only a little.<\/p>\n<p id=\"e921\">You end up with a project where the metrics randomly jump up or down, do not reflect the actual quality, and cannot be improved. The only way out would be to rewrite the entire project from scratch. 
That is when you know \u2014 you shot yourself in the foot with a bazooka.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Experienced teams know when to back up seeing a piling debt, but technical debt in machine learning piles extremely fast.&nbsp;You can create months worth of debt in a matter of one working day and even the most experienced teams can miss a moment when the debt is so huge that it sets them back for half a year, which is often enough to kill a fast-pacing project. You end up with the project where the metrics randomly jump up or down, do not reflect the actual quality, and you are not able to improve them.&nbsp;<\/p>\n","protected":false},"author":306,"featured_media":4115,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[97],"ppma_author":[1943],"class_list":["post-773","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-artificial-intelligence"],"authors":[{"term_id":1943,"user_id":306,"is_guest":0,"slug":"maksym-zavershynskyi","display_name":"Maksym Zavershynskyi","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Zavershynskyi","first_name":"Maksym","job_title":"","description":"Maksym Zavershynskyi is Software Engineer at 
Google"}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/773","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/306"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=773"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/773\/revisions"}],"predecessor-version":[{"id":38145,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/773\/revisions\/38145"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/4115"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=773"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=773"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=773"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=773"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}