{"id":1544,"date":"2019-02-28T03:00:10","date_gmt":"2019-02-28T03:00:10","guid":{"rendered":"http:\/\/kusuaks7\/?p=1149"},"modified":"2023-06-29T12:06:10","modified_gmt":"2023-06-29T12:06:10","slug":"four-most-important-success-factors-in-any-machine-learning-project","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/four-most-important-success-factors-in-any-machine-learning-project\/","title":{"rendered":"Four Most Important Success Factors in any Machine Learning Project"},"content":{"rendered":"<p>If you are a&nbsp;<strong>product manager<\/strong>&nbsp;and want to build something with machine learning, here&rsquo;s a list of the 4 most important things to keep in mind:<\/p>\n<h2>1. Prioritise engineering over data&nbsp;science<\/h2>\n<p style=\"text-align: center;\"><img decoding=\"async\" data-li-src=\"https:\/\/media.licdn.com\/dms\/image\/C5612AQEq8gldEEaEvQ\/article-inline_image-shrink_1500_2232\/0?e=1556755200&amp;v=beta&amp;t=zu9mgNqc5HtRfoV9zgmJnBLh-gz4gH0D2i9zhUxjkg4\" data-media-urn=\"\" src=\"https:\/\/media.licdn.com\/dms\/image\/C5612AQEq8gldEEaEvQ\/article-inline_image-shrink_1500_2232\/0?e=1556755200&amp;v=beta&amp;t=zu9mgNqc5HtRfoV9zgmJnBLh-gz4gH0D2i9zhUxjkg4\" style=\"width: 700px; height: 323px;\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>A machine learning project is first and foremost a software project. Many data scientists have little experience building&nbsp;<strong>well-architected, reliable and easy-to-deploy software<\/strong>. When you build a production system, this will become a problem.<\/p>\n<p>As a rule of thumb,&nbsp;<strong>engineers can pick up data science skills faster than data scientists can pick up engineering experience.<\/strong>&nbsp;If in doubt, work with the Python engineer who has 5+ years of experience and a passion for AI, rather than the PhD in data science who is having their first go at building business applications.<\/p>\n<h2>2. Go&nbsp;lean<\/h2>\n<p>It&rsquo;s important to reduce risks early. 
Structure your project with concrete milestones:<\/p>\n<ol>\n<li><strong>Finished prototype<\/strong>: Find out whether your idea is promising&nbsp;<em>1 day&ndash;2 weeks<\/em><\/li>\n<li><strong>Offline tested system<\/strong>: Tune the model and rigorously test it on existing data&nbsp;<em>2&ndash;4 weeks<\/em><\/li>\n<li><strong>Online tested system<\/strong>: Finalise the model and test it live&nbsp;<em>2&ndash;4 weeks<\/em><\/li>\n<li><strong>Going live<\/strong>: Automate data updates, model training and code deployment&nbsp;<em>2&ndash;4 weeks<\/em><\/li>\n<li><strong>Continuous improvement<\/strong> (optional)&nbsp;<em>12 months<\/em><\/li>\n<\/ol>\n<p><strong>Total timeline to go live: 1&ndash;3 months<\/strong><\/p>\n<p>An experienced team should be able to follow these timelines for almost any project. Focus the team on setting up a live system in 1&ndash;3 months. Once it&rsquo;s live, decide whether further improvements are worth it.<\/p>\n<h3>These temptations can prolong your project unnecessarily:<\/h3>\n<ul>\n<li>Waiting for the perfect data<\/li>\n<li>Using the wrong tools (too complex or too slow)<\/li>\n<li>Overengineering for scalability<\/li>\n<li>Endlessly playing with the algorithms (see next point)<\/li>\n<\/ul>\n<h2>3. The algorithm doesn&rsquo;t&nbsp;matter<\/h2>\n<p style=\"text-align: center;\"><img decoding=\"async\" data-li-src=\"https:\/\/media.licdn.com\/dms\/image\/C5612AQEl9xo5uq8ThQ\/article-inline_image-shrink_1500_2232\/0?e=1556755200&amp;v=beta&amp;t=w6n-Fb9_cz2GkY4fG8GHdVBRfcgsvOdJ1lgAC5ch7pQ\" data-media-urn=\"\" src=\"https:\/\/media.licdn.com\/dms\/image\/C5612AQEl9xo5uq8ThQ\/article-inline_image-shrink_1500_2232\/0?e=1556755200&amp;v=beta&amp;t=w6n-Fb9_cz2GkY4fG8GHdVBRfcgsvOdJ1lgAC5ch7pQ\" style=\"width: 700px; height: 445px;\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>Machine learning systems have lots of fascinating knobs you can play with. 
Don&rsquo;t.<\/p>\n<p>The improvements worth spending time on, in order of importance:<\/p>\n<ol>\n<li>Get more (relevant) input data<\/li>\n<li>Preprocess the data better<\/li>\n<li>Choose the right algorithm and tune it correctly<\/li>\n<\/ol>\n<p><strong>The algorithm is the least important factor<\/strong>. Simply choose an algorithm that works. Endlessly upgrading the algorithm is tempting, but it will probably not give you the results you expect.<\/p>\n<h2>4. Communicate, communicate, communicate<\/h2>\n<p style=\"text-align: center;\"><img decoding=\"async\" data-li-src=\"https:\/\/media.licdn.com\/dms\/image\/C5612AQF_iNv7Wz0qmg\/article-inline_image-shrink_1500_2232\/0?e=1556755200&amp;v=beta&amp;t=TsvtEqq14WQ3lZfTnp7p3BEn_EzKK8vCqjTGSHE_3YU\" data-media-urn=\"\" src=\"https:\/\/media.licdn.com\/dms\/image\/C5612AQF_iNv7Wz0qmg\/article-inline_image-shrink_1500_2232\/0?e=1556755200&amp;v=beta&amp;t=TsvtEqq14WQ3lZfTnp7p3BEn_EzKK8vCqjTGSHE_3YU\" style=\"width: 700px; height: 354px;\" \/><\/p>\n<p>&nbsp;<\/p>\n<h3>Share as much of the business context as possible:<\/h3>\n<p>Once the engineering team starts building, they have to make a lot of choices. The better they know your priorities, the more often they will make the right decisions. You should at least tell them about:<\/p>\n<ul>\n<li><strong>Strategic priorities<\/strong><\/li>\n<\/ul>\n<p>Is this fixing a critical issue? Will it need to handle millions of requests a day? Or is it research for a future product?<\/p>\n<ul>\n<li><strong>Problems with the current process<\/strong><\/li>\n<\/ul>\n<p>Does the current process take too long? Is it too inaccurate? Or is there a lot of data that simply can&rsquo;t be taken into account without machine learning?<\/p>\n<ul>\n<li><strong>Inputs and outputs<\/strong><\/li>\n<\/ul>\n<p>Inputs: What data would you (as a human) use to make the right decisions?<\/p>\n<p>Outputs: Who will consume the output? How frequently? 
Does it need to be real time?<\/p>\n<ul>\n<li><strong>Performance metrics<\/strong><\/li>\n<\/ul>\n<p>What are the most important metrics: Click-through rate? Sales? ROI? False positive rate?<\/p>\n<ul>\n<li><strong>Expected accuracy<\/strong><\/li>\n<\/ul>\n<p>If you want to optimise conversion rates, then it might not be worth another 2 weeks of tuning to get 2% more accuracy.<\/p>\n<p>If you build medical diagnostic systems, then a false negative rate of even 1% can be unacceptable.<\/p>\n<h2>TL;DR<\/h2>\n<ul>\n<li>Prioritise engineering over data science.<\/li>\n<li>Reduce risks by going lean.<\/li>\n<li>Don&rsquo;t get distracted by the algorithm.<\/li>\n<li>Share all business requirements with your developers.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>A machine learning project is first and foremost a software project. Many data scientists have little experience building&nbsp;well-architected, reliable and easy-to-deploy software. When you build a production system, this will become a problem. As a rule of thumb,&nbsp;engineers can pick up data science skills faster than data scientists can pick up engineering experience.&nbsp;If in doubt, work with the Python engineer who has 5+ years of experience and a passion for AI. 
If you are a&nbsp;product manager&nbsp;and want to build something with machine learning, here&rsquo;s a list of the 4 most important things to keep in mind.<\/p>\n","protected":false},"author":314,"featured_media":3995,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[92],"ppma_author":[2069],"class_list":["post-1544","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-machine-learning"],"authors":[{"term_id":2069,"user_id":314,"is_guest":0,"slug":"markus-schmitt","display_name":"Markus Schmitt","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Schmitt","first_name":"Markus","job_title":"","description":"Markus Schmitt is the founder and head of data science at Data Revenue, a Machine Learning Agency based in Berlin, Germany, where he builds custom end-to-end machine learning systems for Medical, Finance and Marketing clients. 
Before Data Revenue he developed new ventures for the company builder Team Europe and studied Mathematics &amp; Economics at Warwick."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1544","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/314"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1544"}],"version-history":[{"count":2,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1544\/revisions"}],"predecessor-version":[{"id":28966,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1544\/revisions\/28966"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/3995"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1544"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1544"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1544"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1544"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}