{"id":1558,"date":"2019-03-07T01:35:57","date_gmt":"2019-03-07T01:35:57","guid":{"rendered":"http:\/\/kusuaks7\/?p=1163"},"modified":"2023-07-28T05:47:53","modified_gmt":"2023-07-28T05:47:53","slug":"scaling-machine-learning-from-0-to-millions-of-users-part-1","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/scaling-machine-learning-from-0-to-millions-of-users-part-1\/","title":{"rendered":"Scaling Machine Learning from 0 to Millions of Users\u200a\u2014\u200aPart 1"},"content":{"rendered":"<section>\n<p id=\"8eca\">I suppose most Machine Learning (ML) models are conceived on a whiteboard or a napkin, and born on a laptop. As the fledgling creatures start babbling their first predictions, we\u2019re filled with pride and high hopes for their future abilities. Alas, we know deep down in our heart that not all of them will be successful, far from it.<\/p>\n<p id=\"1a62\">A small number fail us quickly as we build them. Others look promising, and demonstrate some level of predictive power. We are then faced with the grim challenge of deploying them in a production environment, where they\u2019ll either prove their legendary valour or die an inglorious death.<\/p>\n<figure id=\"213f\"><canvas width=\"75\" height=\"34\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/0*at743M34p8AdMI8S.jpg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/0*at743M34p8AdMI8S.jpg\" \/><figcaption>\u00a0<\/figcaption><\/figure>\n<p style=\"text-align: center;\"><span style=\"font-size: 11px;\">One day, your models will rule the world\u2026 if you read all these posts and pay attention\u00a0\ud83d\ude09<\/span><\/p>\n<p id=\"5949\">In this series of opinionated posts, we\u2019ll discuss\u00a0<strong>how to train ML models and deploy them to production, from humble beginnings to world domination<\/strong>. 
Along the way, we\u2019ll try to take justified and reasonable steps, fighting the evil forces of over-engineering, Hype Driven Development and \u201cwhy don\u2019t you just use XYZ?\u201d.<\/p>\n<blockquote id=\"c4f5\"><p>Enjoy the safe comfort of your Data Science sandbox while you can, and prepare yourself for the cold, harsh world of production.<\/p><\/blockquote>\n<h3 id=\"7260\"><strong>Day 0<\/strong><\/h3>\n<p id=\"afd8\">So you want to build an ML model. Hmmm. Let\u2019s pause for a minute and consider this:<\/p>\n<ul>\n<li id=\"c0f3\">Could your business problem be addressed by a\u00a0<strong>high-level AWS service<\/strong>, such as\u00a0<a href=\"http:\/\/aws.amazon.com\/rekognition\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/aws.amazon.com\/rekognition\" data->Amazon Rekognition<\/a>,\u00a0<a href=\"http:\/\/aws.amazon.com\/polly\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/aws.amazon.com\/polly\" data->Amazon Polly<\/a>, etc.?<\/li>\n<li id=\"cabe\">Or by the\u00a0<a href=\"https:\/\/medium.com\/@julsimon\/applying-machine-learning-to-aws-services-9768f926f11f\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/medium.com\/@julsimon\/applying-machine-learning-to-aws-services-9768f926f11f\" data->growing list of applied ML features<\/a>\u00a0embedded in other AWS services?<\/li>\n<\/ul>\n<p id=\"aeb3\">Don\u2019t wave this off:\u00a0<strong>no Machine Learning is easier to manage than no Machine Learning<\/strong>. Figuring out a way to use high-level services could save you weeks of work, maybe months.<\/p>\n<h4 id=\"a08e\">If the answer is\u00a0\u201cyes\u201d<\/h4>\n<p id=\"914a\">Please ask yourself:<\/p>\n<ul>\n<li id=\"7e89\">Why would you go through all the trouble of building a redundant custom solution?<\/li>\n<li id=\"a8c6\">Are you really \u201c<em>missing features<\/em>\u201d? 
What\u2019s the\u00a0<strong>real<\/strong>\u00a0business impact?<\/li>\n<li id=\"3394\">Do you really need \u201c<em>more accuracy<\/em>\u201d? How do you know\u00a0<strong>you<\/strong>\u00a0could reach it?<\/li>\n<\/ul>\n<p id=\"8a4f\">If you\u2019re unsure, why not run a\u00a0<strong>quick PoC\u00a0<\/strong>with your own data? These services are fully managed (no\u2026 more\u2026 servers) and very easy to integrate into any application. It shouldn\u2019t take a lot of time to figure them out, and you would then have solid data to make an\u00a0<strong>educated decision<\/strong>\u00a0on whether you really need to train your own model or not.<\/p>\n<blockquote id=\"a400\"><p>If these services work well enough for you, congratulations, you\u2019re mostly done! If you decide to build, I\u2019d love to hear your feedback. Please get in touch.<\/p><\/blockquote>\n<h4 id=\"4cf6\"><strong>If the answer is\u00a0\u201cno\u201d<\/strong><\/h4>\n<p id=\"69c4\">Please ask yourself the question again! 
Most of us have an amazing capability to twist reality and deceive ourselves\u00a0\ud83d\ude42 If the honest answer is really \u201cno\u201d, then I\u2019d still recommend thinking about\u00a0<strong>subprocesses<\/strong>\u00a0where you could use the high-level services, e.g.:<\/p>\n<ul>\n<li id=\"7176\">using\u00a0<a href=\"http:\/\/aws.amazon.com\/translate\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/aws.amazon.com\/translate\" data->Amazon Translate<\/a>\u00a0for supported language pairs and using your own solution for the rest,<\/li>\n<li id=\"2f73\">using\u00a0<a href=\"http:\/\/aws.amazon.com\/rekognition\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/aws.amazon.com\/rekognition\" data->Amazon Rekognition<\/a>\u00a0to detect faces before feeding them to your model,<\/li>\n<li id=\"147b\">using\u00a0<a href=\"http:\/\/aws.amazon.com\/textract\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/aws.amazon.com\/textract\" data->Amazon Textract<\/a>\u00a0to extract text before feeding it to your NLP model.<\/li>\n<\/ul>\n<p id=\"3885\">This isn\u2019t about pitching AWS services (do I look like a salesperson?). I\u2019m simply\u00a0<strong>trying to save you from reinventing the wheel\u00a0<\/strong>(or parts of the wheel): you should really be\u00a0<strong>focusing on the business problem<\/strong>\u00a0at hand, instead of building a house of cards that you read about in a blog post or saw at a conference. 
Yes, it may look great on your resume, and the wheel is initially a fun merry-go-round\u2026 and then, it turns into\u00a0<strong>the Wheel of Pain, you\u2019re chained to it and someone else is holding the whip<\/strong>.<\/p>\n<figure id=\"4e63\"><canvas width=\"75\" height=\"50\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/0*MK8gZmjzXiAIkTee.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/0*MK8gZmjzXiAIkTee.png\" \/><\/figure>\n<p style=\"text-align: center;\"><span style=\"font-size: 11px;\">Why did I blindly trust that meetup talk? Crom! Help me escape and bash that guy\u2019s skull with his\u00a0laptop.<\/span><\/p>\n<p id=\"9e54\">Anyway, enough negativity\u00a0\ud83d\ude42 You do need a model, let\u2019s move on.<\/p>\n<h3 id=\"0adf\">Day 1: one user\u00a0(you)<\/h3>\n<p id=\"06ad\">We\u2019ll start our journey at the stage where you\u2019ve trained a model on your local machine (or a local dev server), using a popular open source library like\u00a0<a href=\"https:\/\/scikit-learn.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/scikit-learn.org\">scikit-learn<\/a>,\u00a0<a href=\"https:\/\/www.tensorflow.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.tensorflow.org\">TensorFlow<\/a>\u00a0or\u00a0<a href=\"https:\/\/mxnet.incubator.apache.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/mxnet.incubator.apache.org\">Apache MXNet<\/a>. Maybe you\u2019ve even implemented your own algorithm (Data scientists, you devils).<\/p>\n<p id=\"2766\">You\u2019ve measured the model\u2019s accuracy using your test set, and things look good. Now you\u2019d like to deploy the model to production in order to check its actual behaviour, run A\/B tests, etc. 
Where to start?<\/p>\n<h4 id=\"e3c4\">Batch prediction or real-time prediction?<\/h4>\n<p id=\"bee5\">First, you should figure out whether your application requires\u00a0<strong>batch prediction<\/strong>\u00a0(i.e. collect a large number of data points, process them periodically and store results somewhere), or\u00a0<strong>real-time prediction<\/strong>\u00a0(i.e. send a data point to a web service and receive an immediate prediction). I bring this up early on because it has a large impact on deployment complexity.<\/p>\n<p id=\"bd97\">At first sight, real-time prediction sounds more appealing (because\u2026 real-time, yeah!), but it also comes with stronger requirements, inherent in web services: high availability, the ability to handle traffic bursts, etc. Batch is more relaxed, as it only needs to run every now and then: as long as you don\u2019t lose data, no one will notice if it\u2019s broken in between\u00a0\ud83d\ude09<\/p>\n<p id=\"9357\">Scaling is not a concern right now: all you care about is deploying your model, kicking the tires, running some performance tests, etc. From my experience,\u00a0<strong>you\u2019ve probably taken the shortest route and deployed everything to a single Amazon EC2 instance<\/strong>. Everybody knows a bit of Linux CLI, and you read somewhere that using \u201cIaaS will protect you from evil vendor lock-in\u201d. Ha! EC2 it is, then!<\/p>\n<blockquote id=\"49ea\"><p>I hear screams of horror and disbelief across the AWS time-space continuum, and maybe some snarky comments along the lines of \u201coh this is totally stupid, no one actually does that!\u201d. Well, I\u2019ll put money on the fact that this is by far how most people get started. 
Congrats if you didn\u2019t, but please let me show these good people which way is out before they really hurt themselves\u00a0\ud83d\ude09<\/p><\/blockquote>\n<p id=\"f8c6\">And so, staring into my magic mirror, I see\u2026<\/p>\n<h4 id=\"75f3\">Batch prediction<\/h4>\n<p id=\"bff1\">You\u2019ve copied your model, your batch script and your application to an EC2 instance. Your batch script runs periodically as a cron job, and saves predicted data to local storage. Your application loads both the model and initial predicted data at startup, and uses it to do whatever it has to do. It also periodically checks for updated predictions, and loads them whenever they\u2019re available.<\/p>\n<h4 id=\"b9cb\">Real-time prediction<\/h4>\n<p id=\"3b3d\">You\u2019ve embedded the model in your application, loading it at startup and serving predictions using all kinds of data (user input, files, APIs, etc.).<\/p>\n<p id=\"6938\">One way or the other, you\u2019re now running predictions in the cloud, and life is good. You celebrate with a pint of stout\u2026 or maybe gluten-free, fair-trade, organic soy milk latte, because it\u2019s 2019 after all.<\/p>\n<h3 id=\"a128\">Week 1: one sorry user\u00a0(you)<\/h3>\n<p id=\"eac3\">The model predicts nicely, and you\u2019d like to invest more time in collecting more data and adding features. 
Unfortunately, it didn\u2019t take long for things to go wrong and you\u2019re now\u00a0<strong>bogged down in all kinds of issues<\/strong>\u00a0(non-exhaustive list below):<\/p>\n<ul>\n<li id=\"8262\">Training on your laptop and deploying manually to the cloud is painful and error-prone.<\/li>\n<li id=\"9249\">You accidentally terminated your EC2 instance and had to reinstall everything from scratch.<\/li>\n<li id=\"a64c\">You \u2018<em>pip install<\/em>\u2019-ed a Python library and now your EC2 instance is all messed up.<\/li>\n<li id=\"83ef\">You had to manually install two other instances for your colleagues, and now you can\u2019t really be sure that you\u2019re all using identical environments.<\/li>\n<li id=\"59c4\">Your first load test failed, but you\u2019re not sure what to blame: application? model? the ancient wizards of Acheron?<\/li>\n<li id=\"a93b\">You\u2019d like to implement the same algorithm in TensorFlow, and maybe Apache MXNet too: more environments, more deployments. No time for that.<\/li>\n<li id=\"c3a8\">And of course, everyone\u2019s favorite: Sales have heard that \u201cyour product now has AI capabilities\u201d. You\u2019re terrified that they could sell it to a customer and ask you to go live at scale next week.<\/li>\n<\/ul>\n<p id=\"ebb5\">The list goes on. It would be funny if it wasn\u2019t real (feel free to add your own examples in the comments). All of a sudden, this ML adventure doesn\u2019t sound as exciting, does it?\u00a0<strong>You\u2019re spending most of your time on firefighting, not on building the best possible model<\/strong>. 
This can\u2019t go on!<\/p>\n<figure id=\"9908\"><canvas width=\"75\" height=\"46\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/0*e2Y4pLSeeGOT0tKv\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/0*e2Y4pLSeeGOT0tKv\" \/><\/figure>\n<p style=\"text-align: center;\"><span style=\"font-size: 11px;\">I\u2019ve revoked your IAM credentials on \u2018<em>TerminateInstances<\/em>\u2019. Yes, even in the dev account. Any questions?<\/span><\/p>\n<h3 id=\"2343\">Week 2: fighting\u00a0back<\/h3>\n<p id=\"fede\">Someone on the team watched this really cool AWS video, featuring a new ML service called\u00a0<a href=\"http:\/\/aws.amazon.com\/sagemaker\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/aws.amazon.com\/sagemaker\" data->Amazon SageMaker<\/a>. You make a mental note of it, but right now, there\u2019s no time to rebuild everything: Sales is breathing down your neck, you have a customer demo in a few days, and you need to harden the existing solution.<\/p>\n<p id=\"03fb\">Chances are, you don\u2019t have a mountain of data yet: training can wait. You need to focus on making prediction reliable. Here are some solid measures that won\u2019t take more than a few days to implement.<\/p>\n<h4 id=\"c5d0\">Use the Deep Learning\u00a0AMI<\/h4>\n<p id=\"d4f3\">Maintained by AWS, this\u00a0<a href=\"https:\/\/aws.amazon.com\/machine-learning\/amis\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/aws.amazon.com\/machine-learning\/amis\/\" data->Amazon Machine Image<\/a>\u00a0comes\u00a0<strong>pre-installed<\/strong>\u00a0with a lot of tools and libraries that you\u2019ll probably need: open source libraries, NVIDIA drivers, etc. 
Not having to manage them will save you a lot of time, and will also guarantee that your multiple instances run with the same setup.<\/p>\n<p id=\"33ce\">The AMI also comes with the\u00a0<a href=\"https:\/\/conda.io\/en\/latest\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/conda.io\/en\/latest\/\" data->Conda<\/a>\u00a0<strong>dependency and environment manager<\/strong>, which lets you quickly and easily create many isolated environments: that\u2019s a great way to test your code with different Python versions or different libraries, without unexpectedly clobbering everything.<\/p>\n<p id=\"114a\">Last but not least, this AMI is\u00a0<strong>free of charge<\/strong>, and just like any other AMI, you can customize it if you *really* have to.<\/p>\n<h4 id=\"38de\">Break the\u00a0monolith<\/h4>\n<p id=\"8b5e\">Your application code and your prediction code have\u00a0<strong>different requirements<\/strong>. Unless you have a compelling reason to do so (ultra low latency might be one), they shouldn\u2019t live under the same roof. Let\u2019s look at some reasons why:<\/p>\n<ul>\n<li id=\"7372\"><strong>Deployment<\/strong>: do you want to restart or update your app every time you update the model? Or ping your app to reload it or whatever? No no no no. Keep it simple: when it comes to decoupling,\u00a0<strong>nothing beats building separate services<\/strong>.<\/li>\n<li id=\"7674\"><strong>Performance<\/strong>: what if your application code runs best on memory-intensive instances and your ML model requires a GPU? How will you handle that trade-off? Why would you favour one or the other? Separating them lets you\u00a0<strong>pick the best instance type for each use case<\/strong>.<\/li>\n<li id=\"3df6\"><strong>Scalability<\/strong>: what if your application code and your model have different scalability profiles? 
It would be a shame to scale out on GPU instances because a small piece of your application code is running hot\u2026 Again, it\u2019s better to keep things separate: this will help you make the most\u00a0<strong>appropriate scaling decisions<\/strong>\u00a0as well as reduce cost.<\/li>\n<\/ul>\n<p id=\"f5da\">Now, what about pre-processing \/ post-processing code, i.e. actions that you need to take on the data just before and just after predicting? Where should it go? It\u2019s hard to come up with a definitive answer: I\u2019d say that\u00a0<strong>model-independent actions<\/strong>\u00a0(formatting, logging, etc.) should stay in the application, whereas\u00a0<strong>model-dependent actions<\/strong>\u00a0(feature engineering) should stay close to the model to avoid deployment inconsistencies.<\/p>\n<h4 id=\"238c\">Build a prediction service<\/h4>\n<p id=\"d274\">Separating the prediction code from the application code doesn\u2019t have to be painful, and you can reuse\u00a0<strong>solid, scalable tools<\/strong>\u00a0to build a prediction service. Let\u2019s look at some options:<\/p>\n<ul>\n<li id=\"ba89\"><strong>Scikit-learn<\/strong>: when it comes to building web services in Python, I\u2019m a big fan of\u00a0<a href=\"http:\/\/flask.pocoo.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/flask.pocoo.org\">Flask<\/a>. It\u2019s neat, simple and it scales well. No need to look further IMHO. Your code would look something like this.<\/li>\n<\/ul>\n<p style=\"text-align: center;\">\n<ul>\n<li id=\"1225\"><strong>TensorFlow<\/strong>: no coding required! You can use\u00a0<strong>TensorFlow Serving<\/strong>\u00a0to serve predictions at scale. 
Once you\u2019ve trained your model and saved it in the proper format, all it takes to serve predictions is:<\/li>\n<\/ul>\n<pre id=\"34b5\"><code>docker run -p 8500:8500 \\\n--mount type=bind,source=\/tmp\/myModel,target=\/models\/myModel \\\n-e MODEL_NAME=myModel -t tensorflow\/serving &amp;<\/code><\/pre>\n<ul>\n<li id=\"2b60\"><strong>Apache MXNet<\/strong>: in a similar way, Apache MXNet provides a\u00a0<a href=\"https:\/\/github.com\/awslabs\/mxnet-model-server\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/awslabs\/mxnet-model-server\" data-><strong>model server<\/strong><\/a>, able to serve MXNet and\u00a0<a href=\"https:\/\/onnx.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/onnx.ai\/\" data-><strong>ONNX<\/strong><\/a>\u00a0models (the latter is a common format supported by PyTorch, Caffe2 and more). It can either run as a stand-alone application, or\u00a0<a href=\"https:\/\/github.com\/awslabs\/mxnet-model-server\/blob\/master\/docker\/README.md\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/awslabs\/mxnet-model-server\/blob\/master\/docker\/README.md\" data->inside a Docker container<\/a>.<\/li>\n<\/ul>\n<p id=\"5ab6\">Both model servers are pre-installed on the\u00a0<strong>Deep Learning AMI:\u00a0<\/strong>that\u2019s another reason to use it. To keep things simple, you could leave your pre\/post-processing in the application and invoke the model deployed by the model server. 
A word of warning, however: these model servers implement neither authentication nor throttling, so please make sure not to expose them directly to Internet traffic.<\/p>\n<ul>\n<li id=\"0e66\"><strong>Anything else<\/strong>: if you\u2019re using another environment (say, custom code) or non-web architectures (say, message passing), the same pattern should apply: build a separate service that can be\u00a0<strong>deployed and scaled independently<\/strong>.<\/li>\n<\/ul>\n<h4 id=\"0d4e\">(Optional) Containerize your application<\/h4>\n<p id=\"77ca\">Since you\u2019ve decided to split your code, I would definitely recommend that you use the opportunity to package the different pieces in Docker containers: one for\u00a0<strong>training<\/strong>, one for\u00a0<strong>prediction<\/strong>, one (or more) for the\u00a0<strong>application<\/strong>. It\u2019s not strictly necessary at this stage, but if you can spare the time, I believe the early investment is worth it.<\/p>\n<blockquote id=\"4746\"><p>If you\u2019ve been living under a rock or never really paid attention to containers, now\u2019s probably the time to catch up :) I highly recommend running the\u00a0<a href=\"https:\/\/docs.docker.com\/get-started\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.docker.com\/get-started\/\" data->Docker tutorial<\/a>, which will teach you everything you need to know for our purpose.<\/p><\/blockquote>\n<p id=\"fa9c\">Containers make it easy to\u00a0<strong>move code across different environments<\/strong>\u00a0(dev, test, prod, etc.) and instances. They solve all kinds of dependency issues, which tend to pop up even if you\u2019re only managing a small number of instances. 
Later on, containers will also be a pre-requisite for larger-scale solutions such as Docker clusters or Amazon SageMaker.<\/p>\n<h3 id=\"2d30\">End of week\u00a02<\/h3>\n<p id=\"a52d\">After a rough start, things are looking up!<\/p>\n<ul>\n<li id=\"7da4\">The Deep Learning AMI provides a stable, well-maintained foundation to build on.<\/li>\n<li id=\"47a3\">Containers help you move and deploy your application with much less infrastructure drama than before.<\/li>\n<li id=\"98b1\">Prediction now lives outside of your application, making testing, deployment and scaling simpler.<\/li>\n<li id=\"4c6a\">If you can use them, model servers save you most of the trouble of writing a prediction service.<\/li>\n<\/ul>\n<p id=\"de8a\">Still, don\u2019t get too excited. Yes, we\u2019re back on track and ready for bigger things, but there\u2019s still a ton of work to do. What about\u00a0<strong>scaling prediction to multiple instances,<\/strong>\u00a0<strong>high availability<\/strong>,\u00a0<strong>managing cost<\/strong>, etc.? And what should we do when mountains of training data start piling up?\u00a0<strong>Face it, we\u2019ve barely scratched the surface<\/strong>.<\/p>\n<p id=\"e815\">\u201c<em>Old fool! Load balancers! Auto Scaling! Automation!<\/em>\u201d, I hear you cry. Oh, you mean you\u2019re in a hurry to manage infrastructure again? I thought you guys wanted to do Machine Learning\u00a0\ud83d\ude09<\/p>\n<p id=\"c500\">On this bombshell, it\u2019s time to call it a day. In the next post,\u00a0we\u2019ll start comparing and challenging options for larger-scale ML training:\u00a0<strong>EC2<\/strong>\u00a0vs.\u00a0<strong>ECS\/EKS<\/strong>\u00a0vs.\u00a0<strong>SageMaker<\/strong>. An epic battle, no doubt.<\/p>\n<\/section>\n","protected":false},"excerpt":{"rendered":"<p>So you want to build an ML model. No Machine Learning is easier to manage than no Machine Learning. Figuring out a way to use high-level services could save you weeks of work, maybe months. 
In this series of posts, we&rsquo;ll discuss&nbsp;how to train ML models and deploy them to production, from humble beginnings to world domination. Along the way, we&rsquo;ll try to take justified and reasonable steps, fighting the evil forces of over-engineering.<\/p>\n","protected":false},"author":491,"featured_media":24124,"comment_status":"open","ping_status":"open","sticky":false,"template":"single-post-2.php","format":"standard","meta":{"footnotes":""},"categories":[183],"tags":[92],"ppma_author":[3117],"class_list":["post-1558","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-machine-learning"],"authors":[{"term_id":3117,"user_id":491,"is_guest":0,"slug":"julien-simon","display_name":"Julien SIMON","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","author_category":"","user_url":"","last_name":"SIMON","first_name":"Julien","job_title":"","description":"Julien SIMON&nbsp;is Global Technical Evangelist, Artificial Intelligence and Machine Learning at Amazon Web Services. 
He frequently speaks at conferences and holds all eight AWS certifications."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1558","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/491"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1558"}],"version-history":[{"count":0,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1558\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/24124"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1558"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1558"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1558"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1558"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}