{"id":1266,"date":"2019-02-15T10:32:02","date_gmt":"2019-02-15T10:32:02","guid":{"rendered":"http:\/\/kusuaks7\/?p=871"},"modified":"2021-05-17T18:14:54","modified_gmt":"2021-05-17T18:14:54","slug":"why-i-think-machine-learning-enhanced-software-systems-are-the-future","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/why-i-think-machine-learning-enhanced-software-systems-are-the-future\/","title":{"rendered":"Why I think machine learning-enhanced software systems are the future."},"content":{"rendered":"<p style=\"text-align: center;\"><img decoding=\"async\" alt=\"experfy-blog\" src=\"https:\/\/cdn-images-1.medium.com\/max\/2000\/1*OOWSoWHeQ5kyJ4N0P2ptNA.png\" style=\"width: 620px; height: 349px;\"><\/p>\n<p>I have been brewing the idea of using machine learning to improve software systems since 2016. It was pretty vague and broad, without an actionable plan. I just had the intuition \u2014 the software configuration and tuning, especially after the adoption of microservices, was getting too complex.<\/p>\n<h3 style=\"margin-left: -1.2pt;\"><strong>The increasing complexity of configuring and tuning&nbsp;systems<\/strong><\/h3>\n<p>If you have enough experience in the software industry, then it\u2019s very likely that you\u2019ve struggled with either a configuration problem or a tuning problem.<\/p>\n<p>Configuration and tuning problems are pretty common and can lead to really bad outages. 
They often occur when:<\/p>\n<ol>\n<li>Some parts of the system are poorly or wrongly configured, or<\/li>\n<li>A configuration that worked before now doesn\u2019t work because the context of the system has changed.<\/li>\n<\/ol>\n<p>Think of the number of database replicas and their write schemes. Or in PostgreSQL, think of shared buffers, effective cache size, and the min and max WAL size.<\/p>\n<p>If a system is wrongly configured from the start, it won\u2019t work in the given context, plain and simple. What\u2019s more interesting, though, is that even if it\u2019s&nbsp;<em>correctly<\/em>&nbsp;configured, it might work only for a while. As the context changes \u2014 system workload, system resource usage, overall system architecture \u2014 the system will behave poorly. Or, even worse, an outage might happen.<\/p>\n<p>This will, inevitably, lead to manually performed operations and the creation of heuristics. In other words, it will lead to:<\/p>\n<p><em>Oh, we should set X to A when workload is T, but it should be A+10 when workload is T+100 and we have system resource usage above 80%\u2026 I guess. Or maybe let\u2019s just put a queue in front of this component, queues solve everything, right?<\/em><\/p>\n<p>Now multiply this scenario by tens or hundreds of services. Think for a second about the cognitive burden resulting from these configurations.<\/p>\n<p>This is not a new concern. In 2003, Ganek and Corbi&nbsp;<a href=\"http:\/\/ieeexplore.ieee.org\/document\/5386835\/?reload=true\" target=\"_blank\" rel=\"noopener noreferrer\">discussed<\/a>&nbsp;the need for autonomic computing to handle the complexity of managing software systems. They noted that managing complex systems had become too costly, labor-intensive, and prone to error due to the pressure engineers felt while maintaining them. 
This increased the potential for system outages with a concurrent impact on business.<\/p>\n<p>Even nowadays, most system configuration and tuning is performed manually, often at runtime, which is known to be a very time-consuming and risky practice. Check out these two links (<a href=\"https:\/\/link.springer.com\/book\/10.1007\/978-3-642-35813-5\" target=\"_blank\" rel=\"noopener noreferrer\">here<\/a>&nbsp;and&nbsp;<a href=\"http:\/\/citeseerx.ist.psu.edu\/viewdoc\/download?doi=10.1.1.90.8651&amp;rep=rep1&amp;type=pdf\" target=\"_blank\" rel=\"noopener noreferrer\">here<\/a>) to read more about it.<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" alt=\"experfy-blog\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*6Kh0EXVHQ9zmau8kLs4pzQ.jpeg\" style=\"width: 400px; height: 299px;\"><\/p>\n<h3 style=\"margin-left: -1.2pt;\"><strong>The need for autonomic computing<\/strong><\/h3>\n<p>Most decisions to configure and tune the system are made based on the context \u2014 there are many different variables such as workload, number of instances of some services, resource usage, and more. So why not delegate these tasks to something that excels at exactly that?&nbsp;<em>Machine learning sounds like a feasible tool for the job.<\/em><\/p>\n<p>After starting my Master\u2019s at the University of British Columbia, I kept working on this idea. It seemed interesting although quite weird, and, sometimes, impractical and impossible to implement.<\/p>\n<p>To my surprise, I realized I wasn\u2019t alone. 
Some very interesting people were working on these ideas \u2014 so it might not be that weird, impractical, and impossible.<\/p>\n<p>Recently, Jeff Dean \u2014 a man I admire a lot \u2014 <a href=\"https:\/\/news.ycombinator.com\/item?id=15892956\" target=\"_blank\" rel=\"noopener noreferrer\">gave a talk at NIPS 2017 about machine learning for systems<\/a>, where he stated:<\/p>\n<p><em>Learning should be used throughout our computing systems. Traditional low-level systems code (operating systems, compilers, storage systems) does not make extensive use of machine learning today. This should&nbsp;change!<\/em><\/p>\n<p><em>Computer Systems are filled with heuristics: compilers, networking code, operating systems. Heuristics have to work well \u201cin general case\u201d. [They] generally don\u2019t adapt to actual pattern of usage and don\u2019t take into account available context<\/em><\/p>\n<p><em>Learning in the core of all of our computer systems will make them better\/more adaptive.<\/em><\/p>\n<p>I was in complete awe when I read this. One of the engineers I admire the most was talking about the very same ideas I\u2019ve been thinking about and working on.<\/p>\n<p>This led me to think that it\u2019s not only interesting but&nbsp;<strong>natural to think about enhancing software systems with machine learning.<\/strong>&nbsp;Throughout the whole software stack, we have many heuristics that, although they work well, could be improved by machine learning.<\/p>\n<p>Is it challenging and potentially risky? Yes, most definitely. Especially given that interpretability, apparently, has become a secondary goal in the machine learning community. 
How can we interpret and explain the decisions made by neural nets?<\/p>\n<p>That said, these obstacles shouldn\u2019t hinder scientific and technological progress.&nbsp;<a href=\"https:\/\/arxiv.org\/pdf\/1712.01208.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">Yes, we should question old paradigms&nbsp;<\/a>and try to improve things.<\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" alt=\"experfy-blog\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*puiL2EVDE6Ztlocw3JD1uQ.png\" style=\"width: 679px; height: 243px;\"><\/p>\n<h3 style=\"margin-left: -1.2pt;\"><strong>Towards machine learning-enhanced software&nbsp;systems<\/strong><\/h3>\n<p>As Jeff Dean pointed out, we need to find&nbsp;<strong>practical<\/strong>&nbsp;ways to make systems data-aware. We need systems that collect metrics and metadata about themselves. To achieve this, we could learn a thing or two from the ideas in systems observability and instrumentation. We have been instrumenting systems for decades, and the data is already there.<\/p>\n<p>We also need to find&nbsp;<strong>practical<\/strong>&nbsp;and&nbsp;<strong>clean<\/strong>&nbsp;ways to&nbsp;<strong>integrate<\/strong>&nbsp;machine learning components into software systems, making learning a first-class citizen in the system. This will lead to&nbsp;<strong>systems that learn how to improve themselves,<\/strong> beating heuristics and manually performed operations. Think about this for a second. It does sound cool&nbsp;<em>and<\/em>&nbsp;feasible.<\/p>\n<p>I would also add that we need&nbsp;<strong>practical<\/strong>&nbsp;and&nbsp;<strong>clean<\/strong>&nbsp;ways to propagate the decisions made by the learned models to the rest of the system. This would allow the system to have self-adaptive capabilities. Here, we could learn something from the control theory community.<\/p>\n<p>The general idea is fairly simple: make a system learn about its behavior by training a model on its context. 
Then allow it to change its structures and configurations in order to optimize for a certain scenario. Now implement this idea in such a way that it could be possible to integrate it into many kinds of systems.<\/p>\n<h3 style=\"margin-left: -1.2pt;\"><strong>Summary<\/strong><\/h3>\n<p>The most interesting questions I have in mind are:<\/p>\n<ol>\n<li>Can self-adaptation by learned models lead to more stable, faster, safer software systems? Can it reduce the need for manually configuring and tuning systems, allowing engineers to focus on more important tasks?<\/li>\n<li>Can this be easily integrated into software systems, requiring only small changes to the codebase?<\/li>\n<li>Can this work with low overhead?<\/li>\n<\/ol>\n<p>It is worth noting that this&nbsp;<strong>would not<\/strong>&nbsp;replace good engineers, but would rather free the engineers\u2019 cognitive abilities to focus on what matters.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The general idea is fairly simple: make a system learn about its behavior by training a model on its context. Then allow it to change its structures and configurations in order to optimize for a certain scenario. 
Now implement this idea in such a way that it could be possible to integrate it into many kinds of systems.<\/p>\n","protected":false},"author":189,"featured_media":21930,"comment_status":"open","ping_status":"open","sticky":false,"template":"single-post-2.php","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[97],"ppma_author":[2790],"class_list":["post-1266","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-artificial-intelligence"],"authors":[{"term_id":2790,"user_id":189,"is_guest":0,"slug":"rodrigo-araujo","display_name":"Rodrigo Ara\u00fajo","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Ara\u00fajo","first_name":"Rodrigo","job_title":"","description":"Rodrigo Araujo worked as software engineer serving highly scalable ML models and systems. His research is on using machine learning to build self-adaptive systems; systems that can learn how to improve themselves and adapt to different 
contexts."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1266","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/189"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1266"}],"version-history":[{"count":3,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1266\/revisions"}],"predecessor-version":[{"id":21933,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1266\/revisions\/21933"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/21930"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1266"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1266"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1266"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1266"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}