{"id":22525,"date":"2020-12-24T11:55:12","date_gmt":"2020-12-24T11:55:12","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/what-to-do-when-model-doesnt-work\/"},"modified":"2023-09-13T17:47:28","modified_gmt":"2023-09-13T17:47:28","slug":"what-to-do-when-model-doesnt-work","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/what-to-do-when-model-doesnt-work\/","title":{"rendered":"What To Do When \u201cThe Model Doesn\u2019t Work\u201d?"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"22525\" class=\"elementor elementor-22525\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-b74a8f8 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"b74a8f8\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-fef95ae\" data-id=\"fef95ae\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-9ca9836 elementor-widget elementor-widget-text-editor\" data-id=\"9ca9836\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Your team has worked for months to gather data, build a predictive model, create a user interface, and deploy a new machine learning product with some early customers. 
But instead of celebrating victory, you\u2019re now hearing grumbling from the account managers for those early adopter customers: the customers are not happy with the prediction accuracy they are seeing and are starting to think that \u201cthe model doesn\u2019t work\u201d. What do you do now?<\/p>\n\n<p>This is a situation we often see at Pattern Labs when working with organizations that are implementing machine learning in new products. And it\u2019s not an easy one to resolve quickly. The performance of real-world machine learning models is influenced by a large number of factors, some of which may be under your control and others of which may not. On top of that, every real-world modeling problem has an inherent amount of noise\/randomness mixed in with the signal, making it difficult to know what degree of accuracy a predictive model can truly achieve. Add in customers\u2019 expectations of the accuracy they will get from your model, and all of a sudden your data science team is stuck in a tricky situation, trying to figure out where to even begin to solve the problem.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7ac95a7 elementor-widget elementor-widget-heading\" data-id=\"7ac95a7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">1) Understanding the Problem to Solve<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7bb5e73 elementor-widget elementor-widget-text-editor\" data-id=\"7bb5e73\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The first place to start is to make sure the team has a good understanding of 
the customer problem they are trying to solve with the model. It is amazing how often the data science team\u2019s understanding of what defines success differs from the customers\u2019 criteria. We worked recently with a company that was trying to predict the impact of severe weather on a utility\u2019s operations. The technical team was beating their heads against a wall trying to improve the MAPE (mean absolute percentage error) score of their model. When we dug into it, it turned out that MAPE wasn\u2019t the right metric to use at all, and the target they were striving for was one that they had set themselves rather than one that came from the customer. What the customer actually cared most about was our ability to consistently classify each storm into a 1\u20135 impact severity range that they had defined for their operating procedures.<\/p>\n\n<p>Ensuring that the <a href=\"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/data-science-in-the-real-world\/\" target=\"_blank\" rel=\"noreferrer noopener\">data science<\/a> team has a thorough understanding of the problem, and preferably hears it first-hand from customers themselves, is critical to the success of a new initiative. 
If your team is getting stuck in the situation described above where the model doesn\u2019t work, step one is to go back and ensure you have properly defined the problem and understand how your customer measures success.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e08a6fc elementor-widget elementor-widget-heading\" data-id=\"e08a6fc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">2) Is the data right, and complete?<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-af1c406 elementor-widget elementor-widget-text-editor\" data-id=\"af1c406\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The next step is to go back and look at the input data your team has collected. More often than not, when working with complex real-world models the primary reason for lack of model performance is due to issues with the input dataset and features, rather than the model itself. Particularly if you are running multiple types of models (and we would advise you to do so whenever possible) and getting similar results, it\u2019s often a sign that your input dataset is holding you back.<\/p>\n\n<p>A key part of this step is to ensure you have collected as much relevant data as you possibly can. Often there are contributing factors to real-world patterns that are not always intuitive or obvious, and so the more data and features you can collect, the better. There are plenty of techniques at your disposal to downselect the feature data to build a model on the most relevant features, which we\u2019ll discuss in the next step. 
But for this step, the focus is on re-visiting your assumptions around which inputs drive the output you are trying to model, and going back for additional data if needed. For example, when trying to model a real-world phenomenon there are often unobvious factors that need to be taken into account due to their impact on trends and especially on outlier cases \u2014 such as seasonality, weather, calendar events, and even geo-political happenings.<\/p>\n\n<p>Secondly, some simple QA checks should be put in place to ensure that the input data is getting mapped and processed correctly. We recently worked with a client who was struggling with model performance, only to eventually uncover that the issue was not with the model at all \u2014 the client was incorrectly processing some of the geo-located feature data which prevented the models they were running from identifying correct patterns.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8d95458 elementor-widget elementor-widget-image\" data-id=\"8d95458\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"480\" height=\"480\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1bY6G91CNzhJBXXuM-wXukA.png\" class=\"attachment-large size-large wp-image-18265\" alt=\"What To Do When \u201cThe Model Doesn\u2019t Work\u201d?\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1bY6G91CNzhJBXXuM-wXukA.png 480w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1bY6G91CNzhJBXXuM-wXukA-300x300.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1bY6G91CNzhJBXXuM-wXukA-150x150.png 150w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1bY6G91CNzhJBXXuM-wXukA-75x75.png 
75w\" sizes=\"(max-width: 480px) 100vw, 480px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">CRISP-DM Process. Source:&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Cross-industry_standard_process_for_data_mining#\/media\/File:CRISP-DM_Process_Diagram.png\" target=\"_blank\" rel=\"nofollow noopener\">https:\/\/en.wikipedia.org\/wiki\/Cross-industry_standard_process_for_data_mining#\/media\/File:CRISP-DM_Process_Diagram.png<\/a><\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-fbc4d52 elementor-widget elementor-widget-text-editor\" data-id=\"fbc4d52\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The CRISP-DM process is one of the most common frameworks followed by many data science teams for managing projects. We like it for its focus on ensuring business and data understanding prior to diving into the modeling. Two of the key steps in the CRISP-DM process are \u201cdata understanding\u201d and \u201cdata preparation\u201d. Properly following these steps involves an in-depth dive into the input data to really understand it, often aided by visualizations of distributions, trends, and relationships within the data. And \u201cdata preparation\u201d often involves pre-processing, data augmentation, and\/or normalization to prepare for modeling. 
Done properly, these two steps help data scientists ensure that mistakes in input data are not to blame for any model performance issues they later encounter.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2273996 elementor-widget elementor-widget-heading\" data-id=\"2273996\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">3) Model adjustments to dial in performance<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d6e0481 elementor-widget elementor-widget-text-editor\" data-id=\"d6e0481\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Now that you\u2019ve validated the input data as correct and as complete as possible, it\u2019s time to focus on the fun stuff, the modeling itself. One of the highest-impact parts of this step is feature selection: downselecting to the key features that most affect the output and training your model on those, while eliminating redundant or highly correlated features to both speed up training and improve model accuracy. There are several good blog posts out there on feature selection techniques, including univariate selection, recursive feature elimination, and random forest feature importance. Here is one for reference: <a href=\"https:\/\/machinelearningmastery.com\/feature-selection-machine-learning-python\/\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"broken_link\">https:\/\/machinelearningmastery.com\/feature-selection-machine-learning-python\/<\/a>. 
Whichever technique or combination of techniques you employ, make sure to spend time on this step to find the best combination of features for your model.<\/p>\n\n<p>Another important part of this step is to revisit your choice of model, or to consider adding model types or ensembling multiple models. Again, there are many good articles comparing the pros and cons of different models, but we recommend running at least two model types when possible (ideally one being a neural net) to compare results.<\/p>\n\n<p>And finally, once you have selected your features and your model, re-run your hyperparameter tuning. Make sure you have defined your training, validation and test sets so that you are not \u201ccheating\u201d when tuning: the goal is a model that generalizes well to new data, not one tuned so tightly to the training set that it overfits and does a poor job in practice.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-652d0f2 elementor-widget elementor-widget-heading\" data-id=\"652d0f2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">4) Finally, and most importantly, manage your customers\u2019 expectations<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-34b2830 elementor-widget elementor-widget-text-editor\" data-id=\"34b2830\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>This is another critical step that many data scientists ignore, thinking it is \u201cnot their job\u201d. When launching a new-to-the-world machine learning product, there is a fair amount of uncertainty around the performance of the model in the wild. 
Furthermore, as noted above the amount of noise that occurs in the real world around the problem you are solving may set limitations on your model\u2019s performance despite your best efforts to maximize accuracy. It is part of the responsibility of the data science team to work hand-in-hand with product managers, salespeople, and customer success to define the message to the customer on the performance they can expect to see from the model, but also to educate them on how the model will improve over time with additional data to train on.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0fc9f6b elementor-widget elementor-widget-image\" data-id=\"0fc9f6b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"877\" height=\"384\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1emhtWklxC9h_EjiAn7iZBQ.png\" class=\"attachment-large size-large wp-image-18266\" alt=\"What To Do When \u201cThe Model Doesn\u2019t Work\u201d?\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1emhtWklxC9h_EjiAn7iZBQ.png 877w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1emhtWklxC9h_EjiAn7iZBQ-300x131.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1emhtWklxC9h_EjiAn7iZBQ-768x336.png 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1emhtWklxC9h_EjiAn7iZBQ-610x267.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1emhtWklxC9h_EjiAn7iZBQ-750x328.png 750w\" sizes=\"(max-width: 877px) 100vw, 877px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-86bb418 elementor-widget elementor-widget-text-editor\" data-id=\"86bb418\" data-element_type=\"widget\" data-e-type=\"widget\" 
data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>So the next time your team runs into performance challenges when releasing a new model into the wild, rather than playing the blame game or jumping right into adjusting model hyperparameters to optimize fit, take a step back and follow this simple, structured process to work through the problem step-by-step and maximize the probability of success with your new model.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Your team has worked for months to gather data, build a predictive model, create a user interface, and deploy a new machine learning product with some early customers. But instead of celebrating victory, you\u2019re now hearing grumbling from the account managers for those early adopter customers that they\u2019re not happy with the prediction accuracy they<\/p>\n","protected":false},"author":995,"featured_media":18267,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[92,1181,1182],"ppma_author":[3862],"class_list":["post-22525","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-machine-learning","tag-predictive-model","tag-user-interface"],"authors":[{"term_id":3862,"user_id":995,"is_guest":0,"slug":"jon-reifschneider","display_name":"Jon Reifschneider","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/Jon-Reifschneider-150x150.jpeg","user_url":"https:\/\/ai.meng.duke.edu\/","last_name":"Reifschneider","first_name":"Jon","job_title":"","description":"Jon Reifschneider is Executive in Residence at Duke University\u2019s Pratt School of Engineering where he serves as the Director of Masters Studies of the Artificial 
Intelligence for Product Innovation (AIPI) program and teaches in it."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22525","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/995"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=22525"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22525\/revisions"}],"predecessor-version":[{"id":32946,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22525\/revisions\/32946"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/18267"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=22525"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=22525"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=22525"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=22525"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}