{"id":1566,"date":"2019-03-11T04:50:12","date_gmt":"2019-03-11T04:50:12","guid":{"rendered":"http:\/\/kusuaks7\/?p=1171"},"modified":"2023-08-21T14:41:59","modified_gmt":"2023-08-21T14:41:59","slug":"the-cold-start-problem-how-to-build-your-machine-learning-portfolio","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/the-cold-start-problem-how-to-build-your-machine-learning-portfolio\/","title":{"rendered":"The cold start problem: how to build your machine learning portfolio"},"content":{"rendered":"<p id=\"61a4\">Some time ago, I <a href=\"https:\/\/www.experfy.com\/blog\/the-cold-start-problem-how-to-break-into-machine-learning\">wrote about<\/a>\u00a0the things you should do to get hired into your first machine learning job. I said in that post that one thing you should do is build a portfolio of your personal machine learning projects. But I left out the part about how to actually do that, so in this post, I\u2019ll tell you how. [1]<\/p>\n<p id=\"0886\">Because of what our startup does, I\u2019ve seen hundreds of examples of personal projects that ranged from very good to very bad. Let me tell you about two of the very good ones.<\/p>\n<h3 id=\"2f6f\">The all-in<\/h3>\n<p id=\"d4b3\">What follows is a true story, except that I\u2019ve changed names for privacy.<\/p>\n<p id=\"4f4e\">Company X uses AI to alert grocery stores when it\u2019s time for them to order new inventory. We had one student, Ron, who\u00a0<em>really<\/em>\u00a0wanted to work at Company X. Ron wanted to work at Company X so badly, in fact, that he built a personal project that was 100% dedicated to getting him an interview there.<\/p>\n<p id=\"1522\">We don\u2019t usually recommend going all-in on one company like this. It\u2019s risky to do if you\u2019re starting out. 
But \u2014 like I said \u2014 Ron\u00a0<em>really<\/em>\u00a0wanted to work at Company X.<\/p>\n<p id=\"4e24\">So what did Ron build?<\/p>\n<figure id=\"9963\"><canvas width=\"75\" height=\"50\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/1*23FdOvHC2WOVIf4YgoVmQQ.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/1*23FdOvHC2WOVIf4YgoVmQQ.png\" \/><\/figure>\n<p style=\"text-align: center;\"><span style=\"font-size: 11px;\">The red bounding boxes indicate missing\u00a0items.<\/span><\/p>\n<ol>\n<li id=\"e985\">Ron started by duct taping his phone to a grocery cart. Then he drove his cart up and down the aisles of a grocery store while he recorded the aisles with his camera. He did this 10\u201312 times at different grocery stores.<\/li>\n<li id=\"c445\">Once he got home, Ron started to build a machine learning model. His model identified empty spots in grocery store shelves \u2014 places where the cornflakes (or whatever) were missing from the shelves.<\/li>\n<li id=\"b62d\">Here\u2019s the awesome part: Ron built his model\u00a0<em>in real time<\/em>, on GitHub, in full public view. Every day, he\u2019d push improvements to his repo and chronicle the changes in his repo\u2019s README.<\/li>\n<li id=\"ec65\">When Company X realized Ron was doing this, Company X was intrigued. More than intrigued. In fact, Company X was slightly nervous. Why would they be nervous? Because Ron had unknowingly, and in a few days, reproduced a part of their proprietary tech stack. [2]<\/li>\n<\/ol>\n<p id=\"ed8d\">Company X is exceptionally competent, and their technology is among the best in their industry. Nonetheless, within 4 days, Ron\u2019s project had grabbed the direct personal attention of Company X\u2019s CEO.<\/p>\n<h3 id=\"715f\">The pilot\u00a0project<\/h3>\n<p id=\"75bb\">Here\u2019s another true story.<\/p>\n<p id=\"a25c\">Alex is a history major with a minor in Russian studies (really). 
Unusually for a history major, he got interested in machine learning. Even more unusually, he decided he would learn it, despite having never written a line of Python.<\/p>\n<p id=\"841a\">Alex chose to learn by building. He settled on building a classifier to detect if fighter pilots were losing consciousness in their airplanes. Alex wanted to detect this by looking at videos of pilots. He knew it was easy for a person to tell, just by looking, when a pilot is unconscious, so Alex figured it should be possible for a machine to tell, too.<\/p>\n<p id=\"b050\">Here\u2019s what Alex did, over the course of several months:<\/p>\n<figure id=\"ca5c\"><canvas width=\"75\" height=\"37\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/1*QmRHEOydB-d2llPOZ7OUrA.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/640\/1*QmRHEOydB-d2llPOZ7OUrA.png\" \/><\/figure>\n<p style=\"text-align: center;\"><span style=\"font-size: 11px;\">A demo of Alex\u2019s G-force induced loss-of-consciousness detector.<\/span><\/p>\n<ol>\n<li id=\"ce7d\">Alex went on YouTube and downloaded every video clip of pilots flying planes, taken from the cockpit. (In case you\u2019re wondering, there are a few dozen of these clips.)<\/li>\n<li id=\"76b3\">Next he started to label his data. Alex built a UI that let him scroll through thousands of video frames, press one button for \u201cconscious\u201d and another button for \u201cunconscious\u201d, and automatically save that frame in the correctly labeled folder. This labeling was very, very boring and took him many, many days.<\/li>\n<li id=\"1a1c\">Alex built a data pipeline for the images that would crop the pilot out of the cockpit background \u2014 to make it easier for his classifier to focus on the pilot. 
Finally, he built his loss-of-consciousness classifier.<\/li>\n<li id=\"72fd\">At the same time as he was doing all these things, Alex was showing snapshots of his project to hiring managers at networking events. Every time he took out his project and showed it off (on his phone), they asked him how he did it, what pipeline he built, and how he collected his data. But they never quite got around to asking about his model\u2019s\u00a0<em>accuracy<\/em> \u2014 which was under 50%.<\/li>\n<\/ol>\n<p id=\"4360\">Alex planned to improve his accuracy, of course, but he was hired before he got the chance. It turned out that the visual impact of his project, and his relentless resourcefulness in data gathering, mattered much more to companies than how good his model actually was.<\/p>\n<p id=\"e3ca\">Did I mention Alex is a history major with a minor in Russian studies?<\/p>\n<h3 id=\"01eb\">What they have in\u00a0common<\/h3>\n<p id=\"a558\">What made Ron and Alex so successful? Here are four big things they did right:<\/p>\n<ol>\n<li id=\"2a1c\"><strong>Ron and Alex didn\u2019t spend much effort on modelling.<\/strong>\u00a0I know this sounds strange, but for many use cases nowadays modelling is a solved problem. In a real job, unless you\u2019re doing state-of-the-art AI research, you\u2019ll be spending 80\u201390% of your time cleaning your data anyway. Why would your personal project be different?<\/li>\n<li id=\"5bba\"><strong>Ron and Alex gathered their own data.<\/strong>\u00a0Because of this, they ended up with data that was messier than what you\u2019d find on Kaggle or the UCI data repository. But working with messy data taught them to deal with messy data. It also forced them to understand their data better than if they\u2019d downloaded it from an academic server.<\/li>\n<li id=\"2163\"><strong>Ron and Alex built visual things.<\/strong>\u00a0An interview isn\u2019t about your skills being objectively assessed by an all-knowing judge. 
An interview is about selling yourself to another human being. Human beings are visual creatures. So if you pull out your phone and show the interviewer what you built, it\u2019s worth making sure that what you\u2019ve built looks interesting.<\/li>\n<li id=\"872c\"><strong>What Ron and Alex did seems insane.<\/strong>\u00a0And it was insane. Normal people don\u2019t duct tape their phones to shopping carts. Normal people don\u2019t spend their days cropping pilots out of YouTube videos. You know who does that?\u00a0<em>People who will do whatever it takes to get their work done.<\/em>\u00a0And companies\u00a0<em>really, really<\/em>\u00a0want to hire those people.<\/li>\n<\/ol>\n<p id=\"450e\">What Ron and Alex did might seem like too much work, but really, it isn\u2019t much more than you\u2019d be expected to do in a real job. And that\u2019s the whole point: when you don\u2019t have work experience doing X, hiring managers will look for things you\u2019ve done that simulate work experience doing X.<\/p>\n<p id=\"821f\">Fortunately, you only need to build a project at this level once or twice \u2014 Ron and Alex\u2019s projects got reused over and over for all their interviews.<\/p>\n<p id=\"5902\">So if I had to summarize the secret to a great ML project in one sentence, it would be:<em>\u00a0Build a project with an interesting dataset that took obvious effort to collect and make it as visually impactful as possible<\/em>.<\/p>\n<hr \/>\n<p id=\"c00f\">[1] In case you\u2019re wondering why this is important, it\u2019s because hiring managers try to assess you by looking at your track record. If you don\u2019t have a track record, personal projects are the closest substitute.<\/p>\n<p id=\"4f35\">[2] Of course Ron\u2019s attempt was far from perfect: Company X had devoted orders of magnitude more resources to the problem than he had. 
But it was similar enough that they quickly asked Ron to make his repo private.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One thing you should do is build a portfolio of your personal machine learning projects. But, how to do that? I&rsquo;ve seen hundreds of examples of personal projects that ranged from very good to very bad. So in this post, I&rsquo;ll tell you how.&nbsp; &nbsp;If I had to summarize the secret to a great ML project in one sentence, it would be:&nbsp;Build a project with an interesting dataset that took obvious effort to collect and make it as visually impactful as possible.<\/p>\n","protected":false},"author":485,"featured_media":4100,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[92],"ppma_author":[3109],"class_list":["post-1566","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-machine-learning"],"authors":[{"term_id":3109,"user_id":485,"is_guest":0,"slug":"edouard-harris","display_name":"Edouard Harris","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Harris","first_name":"Edouard","job_title":"","description":"Edouard Harris&nbsp;is CEO and co-founder at SharpestMinds (YC W18), an online mentorship program where senior engineers train fresh grads for 
free."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1566","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/485"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1566"}],"version-history":[{"count":2,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1566\/revisions"}],"predecessor-version":[{"id":31016,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1566\/revisions\/31016"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/4100"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1566"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1566"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1566"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1566"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}