{"id":1866,"date":"2019-08-06T03:00:00","date_gmt":"2019-08-06T03:00:00","guid":{"rendered":"http:\/\/kusuaks7\/?p=1471"},"modified":"2024-07-19T13:37:03","modified_gmt":"2024-07-19T13:37:03","slug":"the-surprising-truth-about-what-it-takes-to-build-a-machine-learning-product","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/the-surprising-truth-about-what-it-takes-to-build-a-machine-learning-product\/","title":{"rendered":"The Surprising Truth About What it Takes to Build a Machine Learning Product"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"1866\" class=\"elementor elementor-1866\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-544cd5b6 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"544cd5b6\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6c856362\" data-id=\"6c856362\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5bedffc9 elementor-widget elementor-widget-text-editor\" data-id=\"5bedffc9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"7266\" data-selectable-paragraph=\"\">When I was in college, an ice cream shop opened nearby, and a few friends and I went to check it out. We walked in, and it looked completely normal \u2014 they had all the usual flavors like mint, chocolate, and the like. 
However, at the end of the counter, they had this flavor called \u201cThe Broccoli Surprise\u201d. A naturally curious individual, I had to try it. I asked the attendant behind the counter for a sample. It was white with little green specks, and it tasted sweet, creamy, and rich. I was confused: there was no broccoli flavor in it. So I asked, \u201cWhat\u2019s the surprise?\u201d \u201cThere\u2019s no broccoli,\u201d she replied with a smile.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6038c67 elementor-widget elementor-widget-text-editor\" data-id=\"6038c67\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"c892\" data-selectable-paragraph=\"\">Machine learning (ML) has a surprise, too. One of the biggest misconceptions about ML deployment within organizations concerns where the difficulty and the value actually lie.<\/p>\n<p id=\"bee1\" data-selectable-paragraph=\"\">Integrating ML into your business workflows can be broken down into five activities:<\/p>\n<p id=\"637e\" data-selectable-paragraph=\"\"><strong>Defining KPIs<\/strong>\u00a0\u2014 Key Performance Indicators allow us to measure and discuss what we are trying to improve. Common KPIs include customer retention, manufacturing yield, and employee turnover. 
Setting KPIs is a critical step in machine learning since they ultimately drive optimization along the way to a performant model.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-05b46ef elementor-widget elementor-widget-text-editor\" data-id=\"05b46ef\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"e8fc\" data-selectable-paragraph=\"\"><strong>Collecting Data<\/strong>\u00a0\u2014 Gathering the data that will be used to train your ML algorithms. Yes, you could use ML models others have produced if you lack data. However, those business considerations are similar to those of other SaaS offerings, so let\u2019s leave them out of scope here.<\/p>\n<p id=\"7920\" data-selectable-paragraph=\"\"><strong>Infrastructure<\/strong>\u00a0\u2014 ML infrastructure includes various pieces of software: data management, annotation tools, model training, and testing environments. This infrastructure is an upfront investment, but it makes iterating on and improving the model and data set much more efficient.<\/p>\n<p id=\"6755\" data-selectable-paragraph=\"\"><strong>Optimizing ML Algorithm<\/strong>\u00a0\u2014 Here we consider factors like which model to use based on a given data set\/problem, the amount of necessary training data, the layers in your neural net, and hyperparameter tuning. 
There is a plethora of choices.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2efa349 elementor-widget elementor-widget-text-editor\" data-id=\"2efa349\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"ad93\" data-selectable-paragraph=\"\"><strong>Integration<\/strong>\u00a0\u2014 Getting an ML model working in a vacuum is a great achievement, but it is not until the model is integrated with a real workflow that it starts to create a tangible business impact. Integration is the process of building the pipes and structure that seamlessly pass information and data between users and computers.<\/p>\n<p id=\"bbe6\" data-selectable-paragraph=\"\">Based on many conversations with companies interested in deploying machine learning, both the perceived effort required in, and the perceived payoff from, optimizing a machine learning algorithm are high.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7527d2b elementor-widget elementor-widget-image\" data-id=\"7527d2b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/miro.medium.com\/max\/700\/0*j2zj4aFiMq6uhM32\" alt=\"\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-dc1dde8 elementor-widget elementor-widget-text-editor\" data-id=\"dc1dde8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"e07a\" data-selectable-paragraph=\"\">There are a few possible reasons for this:<\/p>\n\n<ul>\n \t<li id=\"99ec\" data-selectable-paragraph=\"\">For most practitioners, 
optimizing ML models is the biggest \u201cunknown\u201d in the stack, so it\u2019s easy to imagine it being more complicated and time-consuming than it really is.<\/li>\n \t<li id=\"a297\" data-selectable-paragraph=\"\">Availability heuristic: since ML algorithms and optimization get more coverage in the literature and media, people tend to assume that they play a larger role than they do in the actual implementation process.<\/li>\n<\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d3ed4d1 elementor-widget elementor-widget-heading\" data-id=\"d3ed4d1\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h1 class=\"elementor-heading-title elementor-size-default\"><h1 id=\"7ba9\" data-selectable-paragraph=\"\">The Surprise<\/h1><\/h1>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ff545f4 elementor-widget elementor-widget-text-editor\" data-id=\"ff545f4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"4306\" data-selectable-paragraph=\"\">When I talk to practitioners who have extensive experience building and scaling these ML systems inside Google, I hear a very different story. Based on these conversations, optimizing an ML algorithm takes much less effort than expected, but\u00a0<strong>collecting data<\/strong>,\u00a0<strong>building infrastructure,\u00a0<\/strong>and\u00a0<strong>integration\u00a0<\/strong>each<strong>\u00a0<\/strong>take much more work. 
The differences between expectations and reality are profound.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-681323d elementor-widget elementor-widget-image\" data-id=\"681323d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/miro.medium.com\/max\/700\/0*JPKydbSB9YRzy8k6\" alt=\"\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c37e54f elementor-widget elementor-widget-text-editor\" data-id=\"c37e54f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"f356\" data-selectable-paragraph=\"\"><strong>Defining KPIs<\/strong>\u00a0\u2014 Once we deploy data-driven systems, we spend less time and fewer organizational resources selecting KPIs since there are constant streams of data feedback. This obviates the need for proxy KPIs. Since good ML is predicated on good data, we must have a great collection pipeline already in place.<\/p>\n<p id=\"8bdf\" data-selectable-paragraph=\"\"><strong>Collecting Data<\/strong>\u00a0\u2014 Collecting data is almost always an underestimated component of spinning up an ML project. 
Some factors to consider when building a data collection and processing strategy are described in a\u00a0<a href=\"https:\/\/medium.com\/thelaunchpad\/where-does-data-come-from-6115ed2a3a3b\" class=\"broken_link\" rel=\"noopener\">previous post<\/a>.<\/p>\n<p id=\"deaa\" data-selectable-paragraph=\"\"><strong>Infrastructure<\/strong>\u00a0\u2014 Infrastructure building, which is mostly a software engineering task as opposed to an \u201cML task,\u201d is one of the most time-consuming parts of most projects.<\/p>\n<p id=\"d9ac\" data-selectable-paragraph=\"\"><strong>Optimizing ML Algorithm<\/strong>\u00a0\u2014 The task of training and optimizing ML models almost always takes\u00a0<em>less<\/em>\u00a0time and effort than anticipated for two reasons. First, performance is a strong function of what data you possess. Tweaking algorithms yields benefits, but they pale in comparison to the gains from cleaning up your data. Second, tools for optimizing ML algorithms (like\u00a0<a href=\"https:\/\/cloud.google.com\/automl\/\" rel=\"noopener\">AutoML<\/a>) make it much easier and faster to train and optimize models based on labeled or unlabeled data.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-914c7f1 elementor-widget elementor-widget-text-editor\" data-id=\"914c7f1\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"fe1a\" data-selectable-paragraph=\"\"><strong>Integration<\/strong>\u00a0\u2014 Integration is another underestimated part of the ML deployment process. Error and exception handling, redundancies, and the shift from a static product to one of continuous iteration present a host of software, product, and engineering challenges. 
Just think of all the technical debt hidden inside of your training data!<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-923cb81 elementor-widget elementor-widget-text-editor\" data-id=\"923cb81\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"379c\" data-selectable-paragraph=\"\">ML actually has two surprises.<\/p>\n<p id=\"dcf7\" data-selectable-paragraph=\"\">First, many companies are wrong about which parts of the ML implementation process will be difficult. Tools and technical advances are dramatically changing ML optimization at a rate unmatched by software infrastructure for brute force data collection and management. Like the broccoli ice cream \u2014 there is usually not that much ML in an end-to-end ML system.<\/p>\n<p id=\"b6b1\" data-selectable-paragraph=\"\">Secondly, the\u00a0<strong>path<\/strong>\u00a0of implementing ML (asking questions about your customers, building infrastructure to collect, interpret and act upon that data, etc.) is valuable, regardless of whether or not ML is actually implemented in the end. 
Not every problem has an ML-powered solution, but many do, and even those that do not will benefit from this journey.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-97aca9c elementor-widget elementor-widget-text-editor\" data-id=\"97aca9c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p data-selectable-paragraph=\"\">This article first appeared on <a href=\"https:\/\/medium.com\/thelaunchpad\/the-ml-surprise-f54706361a6c\" class=\"broken_link\" rel=\"noopener\">Medium<\/a>.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Machine learning (ML) has a surprise, too. One of the biggest misconceptions about ML deployment within organizations concerns where the difficulty and the value actually lie. Integrating ML into your business workflows can be broken down into five activities. Optimizing an ML algorithm takes much less effort than expected, but&nbsp;collecting data,&nbsp;building infrastructure,&nbsp;and&nbsp;integration&nbsp;each&nbsp;take much more work. The differences between expectations and reality are profound. 
Not every problem has an ML-powered solution, but many do, and even those that do not will benefit from this journey.<\/p>\n","protected":false},"author":610,"featured_media":3528,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[92],"ppma_author":[2719],"class_list":["post-1866","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-machine-learning"],"authors":[{"term_id":2719,"user_id":610,"is_guest":0,"slug":"josh-cogan","display_name":"Josh Cogan","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Cogan","first_name":"Josh","job_title":"","description":"Josh Cogan, PhD, is a Tech Lead and Manager in the Cloud AI group at Google. He is passionate about intersection of business, history, information theory, and ML."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1866","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/610"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1866"}],"version-history":[{"count":6,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1866\/revisions"}],"predecessor-version":[{"id":36904,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1866\/revisions\/36904"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/3528"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1866"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/c
ategories?post=1866"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1866"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1866"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}