{"id":1536,"date":"2019-02-26T00:31:00","date_gmt":"2019-02-26T00:31:00","guid":{"rendered":"http:\/\/kusuaks7\/?p=1141"},"modified":"2023-08-21T14:36:21","modified_gmt":"2023-08-21T14:36:21","slug":"the-cold-start-problem-how-to-break-into-machine-learning","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/the-cold-start-problem-how-to-break-into-machine-learning\/","title":{"rendered":"The cold start problem: how to break into machine learning"},"content":{"rendered":"<p id=\"1075\">It\u2019s hard to get hired into your first machine learning role. That means I get asked lots of questions that look like: right now, I do X. I want to become a machine learning engineer. How do I do it?<\/p>\n<p id=\"db16\">Depending on what X is, that ranges from something that\u2019s pretty hard to do (physics, software engineering) up to something that\u2019s extremely hard to do (UI design, marketing).<\/p>\n<p id=\"63e4\">So to help everyone at the same time, I\u2019ve put together a progression that you can follow from any starting point to actually become a machine learning engineer. Every ML engineer I\u2019ve met (who didn\u2019t go to grad school at\u00a0<a href=\"http:\/\/ai.stanford.edu\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/ai.stanford.edu\">SAIL<\/a>\u00a0or somewhere similar) went through some form of this.<\/p>\n<p id=\"d399\">Before you start learning ML, there\u2019s a set of basics you need first.<\/p>\n<h3 id=\"18f9\">1. Learn\u00a0calculus<\/h3>\n<p id=\"861c\">The first thing you need is multivariable calculus (up to second-year undergrad).<\/p>\n<p id=\"7eef\"><strong>Where to learn it:<\/strong>\u00a0<a href=\"https:\/\/www.khanacademy.org\/math\/differential-calculus\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.khanacademy.org\/math\/differential-calculus\" data->Khan Academy\u2019s differential calculus course<\/a>\u00a0is pretty good. 
Make sure to do the practice problems. Otherwise you\u2019ll just nod along with the course and won\u2019t learn anything.<\/p>\n<h3 id=\"a1f0\">2. Learn linear\u00a0algebra<\/h3>\n<p id=\"93a7\">The second thing you need is linear algebra (up to first-year undergrad).<\/p>\n<p id=\"c42b\"><strong>Where to learn it:<\/strong>\u00a0<a href=\"https:\/\/github.com\/fastai\/numerical-linear-algebra\/blob\/master\/README.md\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/fastai\/numerical-linear-algebra\/blob\/master\/README.md\" data->Rachel Thomas\u2019s mini-course on computational linear algebra<\/a>\u00a0is targeted at people who want to learn ML.<\/p>\n<p id=\"eed2\"><strong>NOTE:<\/strong>\u00a0I\u2019ve heard convincing arguments that you can skip calculus and linear algebra. A few people I know have jumped right into ML and learned most of what they needed by trial and error and intuition, and they turned out okay. Your mileage will vary, but whatever you do, don\u2019t skip this next step:<\/p>\n<h3 id=\"6f4e\">3. Learn to\u00a0code<\/h3>\n<figure id=\"bff3\" data-scroll=\"native\"><canvas width=\"75\" height=\"34\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/480\/1*hdua8VJ2I-_VrqKbkr9bkg.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/480\/1*hdua8VJ2I-_VrqKbkr9bkg.png\" \/><\/figure>\n<p id=\"6416\">The last thing you need is programming experience in Python. You can do ML in other languages, but these days Python is the gold standard.<\/p>\n<p id=\"8ae4\"><strong>Where to learn it:<\/strong>\u00a0Follow the advice in\u00a0<a href=\"https:\/\/www.reddit.com\/r\/learnpython\/comments\/35iyuc\/what_is_the_best_way_to_learn_python\/\" target=\"_blank\" rel=\"noopener noreferrer\" class=\"broken_link\">the top answer of this Reddit thread<\/a>. 
You should also pay close attention to the\u00a0<a href=\"http:\/\/www.numpy.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/www.numpy.org\">numpy<\/a>\u00a0and\u00a0<a href=\"https:\/\/www.scipy.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.scipy.org\">scipy<\/a>\u00a0packages. Those come up a lot.<\/p>\n<p id=\"8ed7\">There\u2019s more to say about good programming practice than I have room for here. In one sentence: make your code\u00a0<a href=\"https:\/\/blog.hartleybrody.com\/python-style-guide\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/blog.hartleybrody.com\/python-style-guide\/\" data->legible<\/a>\u00a0and\u00a0<a href=\"https:\/\/docs.python-guide.org\/writing\/structure\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/docs.python-guide.org\/writing\/structure\/\" data->modular<\/a>, with good tests and\u00a0<a href=\"https:\/\/www.datacamp.com\/community\/tutorials\/exception-handling-python\" target=\"_blank\" rel=\"noopener noreferrer\" class=\"broken_link\">error handling<\/a>.<\/p>\n<p id=\"b701\"><strong>Pro tip:<\/strong>\u00a0If you\u2019re learning to code from scratch, don\u2019t bother memorizing every command. Just learn how to look up questions online fast. And yes,\u00a0<a href=\"https:\/\/twitter.com\/patio11\/status\/988508062431432704?lang=en\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/twitter.com\/patio11\/status\/988508062431432704?lang=en\" data->this is what the pros do<\/a>.<\/p>\n<p id=\"8c55\">Also:\u00a0<a href=\"http:\/\/try.github.io\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/try.github.io\">learn the basics of git<\/a>. It pays off fast.<\/p>\n<h3 id=\"9d69\">Learn machine\u00a0learning<\/h3>\n<p id=\"d72f\">Now you get to learn machine learning itself. 
In 2018, one of the best places to do that is Jeremy Howard\u2019s\u00a0<a href=\"http:\/\/www.fast.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/www.fast.ai\">fast.ai<\/a>\u00a0course, which teaches ML at the state of the art with an approachable curriculum. Go through at least\u00a0<a href=\"http:\/\/course.fast.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/course.fast.ai\">Course 1<\/a>, and ideally\u00a0Course 2, do all the exercises, and you\u2019ll be ahead of most industry practitioners on model-building (really).<\/p>\n<p id=\"d5c2\">Most of the progress in machine learning over the past 6 years has been in deep learning, but there\u2019s much more to the field. There are also decision trees, support vector machines, linear regression, and a bunch of other techniques. You\u2019ll run into these as you progress, but you can probably learn them as they come up. A great centralized place to learn and use them is Python\u2019s\u00a0<a href=\"http:\/\/scikit-learn.org\/stable\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/scikit-learn.org\/stable\/\" data->scikit-learn<\/a>\u00a0package.<\/p>\n<h3 id=\"9032\">Build personal\u00a0projects<\/h3>\n<figure id=\"5020\"><canvas width=\"75\" height=\"28\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/1*dnvGC-PORSoCo7VXT3PV_A.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/1*dnvGC-PORSoCo7VXT3PV_A.png\" \/><\/figure>\n<p id=\"be8a\">Everyone who applies to their first ML position has done personal projects in machine learning and data science, so you should too. But it\u2019s important to do it well, and I\u2019ll cover exactly how in a future post. 
For now, the only thing I\u2019ll say is:\u00a0the most common mistake I see when people showcase personal projects is that\u00a0<em>they apply well-known algorithms to\u00a0<\/em><a href=\"https:\/\/www.kaggle.com\/c\/titanic\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.kaggle.com\/c\/titanic\" data-><em>well-known datasets<\/em><\/a><em>.<\/em><\/p>\n<p id=\"6246\">This is a mistake because (1) machine learning hiring managers already know all the well-known datasets, and (2) they also know that if you showcase a project where you apply a well-known algorithm to a well-known dataset, you might not know how to do much of anything else.<\/p>\n<h3 id=\"7066\">Some things are hard to learn by\u00a0yourself<\/h3>\n<p id=\"233d\">The truth is that a lot of the things that make you stand out from the crowd are hard to learn by yourself. In machine learning, the three biggest ones are (1) data prep, (2) ML devops, and (3) professional networking.<\/p>\n<p id=\"71fe\">Data prep is the set of hacks you use when you work with realistic data. That means dealing with outliers and missing values. But it also means collecting data yourself when there isn\u2019t already a dataset for the problem you want to solve. In real life, you\u2019ll spend 80% of your time cleaning and collecting data. Model-building is an afterthought in the real world, and engineering managers know that.<\/p>\n<p id=\"8a0e\">ML devops is what you do to run your model on the cloud. It costs money to rent compute time, so people sometimes don\u2019t do this in their personal projects. But if you can afford it, it\u2019s worth it to get familiar with the basics. 
Start with\u00a0<a href=\"https:\/\/www.paperspace.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.paperspace.com\">Paperspace<\/a>\u00a0or\u00a0<a href=\"https:\/\/www.floydhub.com\" target=\"_blank\" rel=\"noopener noreferrer\" class=\"broken_link\">Floyd<\/a>\u00a0for an intro to running ML on the cloud.<\/p>\n<p id=\"4fa9\">Honest engineers often ignore networking, because they think they should get hired on their skills alone. The real world doesn\u2019t work like that, even though it should. So talk to people. I\u2019ll write more about this part in a future post.<\/p>\n<h3 id=\"e36d\">Ask for\u00a0help<\/h3>\n<p id=\"96c1\">Some steps are hard to take on your own. Schools aren\u2019t good at teaching data prep, ML devops, or networking. Most people learn those things on the job, or from a mentor if they\u2019re lucky. Many people never learn them at all.<\/p>\n<p id=\"58cf\">But how do you bridge that gap in the general case? How do you get a job without experience when you need a job to get experience?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Some steps are hard to take on your own. Schools aren&rsquo;t good at teaching data prep, ML devops, or networking. Most people learn those things on the job or from a mentor if they&rsquo;re lucky. Many people never learn them at all. But how do you bridge that gap in the general case? How do you get a job without experience when you need a job to get experience? 
So to help everyone at the same time, I&rsquo;ve put together a progression that you can follow from any starting point to actually become a machine learning engineer.<\/p>\n","protected":false},"author":485,"featured_media":3957,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[92],"ppma_author":[3109],"class_list":["post-1536","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-machine-learning"],"authors":[{"term_id":3109,"user_id":485,"is_guest":0,"slug":"edouard-harris","display_name":"Edouard Harris","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Harris","first_name":"Edouard","job_title":"","description":"Edouard Harris&nbsp;is CEO and co-founder at SharpestMinds (YC W18), an online mentorship program where senior engineers train fresh grads for free."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1536","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/485"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1536"}],"version-history":[{"count":3,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1536\/revisions"}],"predecessor-version":[{"id":31015,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1536\/revisions\/31015"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/3957"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1536"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"htt
ps:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1536"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1536"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1536"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}