{"id":893,"date":"2018-09-20T05:25:23","date_gmt":"2018-09-20T05:25:23","guid":{"rendered":"http:\/\/kusuaks7\/?p=498"},"modified":"2023-07-26T13:40:37","modified_gmt":"2023-07-26T13:40:37","slug":"unsupervised-learning-demystified","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/unsupervised-learning-demystified\/","title":{"rendered":"Unsupervised learning demystified"},"content":{"rendered":"<p id=\"efef\">Unsupervised learning may sound like a fancy way to say \u201c<em>let the kids learn on their own not to touch the hot oven\u201d\u00a0<\/em>but it\u2019s actually a pattern-finding technique for mining inspiration from your data. It has nothing to do with machines running around without adult supervision, forming their own opinions about things. 
Let\u2019s demystify!<\/p>\n<figure id=\"079e\"><img decoding=\"async\" style=\"width: 700px; height: 394px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*0nJvs2Fj4N7QeNs1_6UnBA.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*0nJvs2Fj4N7QeNs1_6UnBA.jpeg\" \/><\/figure>\n<p id=\"fff5\" style=\"text-align: center;\">If this feels familiar, unsupervised machine learning might be your new best\u00a0friend.<\/p>\n<p>This post is beginner-friendly, but assumes you\u2019re familiar with\u00a0<strong>the story so far<\/strong>:<\/p>\n<ul>\n<li id=\"b123\">Machine learning is all about <a href=\"https:\/\/www.experfy.com\/blog\/the-simplest-explanation-of-machine-learning-youll-ever-read\">labeling things using examples<\/a>.<\/li>\n<li id=\"e28d\">If you train your system by feeding it the answers you\u2019re looking for, you\u2019re doing <a href=\"https:\/\/www.experfy.com\/blog\/explaining-supervised-learning-to-a-kid-or-your-boss\">supervised learning<\/a>.<\/li>\n<li id=\"9de6\">To\u00a0<a href=\"https:\/\/hackernoon.com\/imagine-a-drunk-island-advice-for-finding-ai-use-cases-8d47495d4c3f\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/hackernoon.com\/imagine-a-drunk-island-advice-for-finding-ai-use-cases-8d47495d4c3f\">get started<\/a>\u00a0with supervised learning, you need to know what labels you want. 
(Not so with unsupervised.)<\/li>\n<li id=\"2904\">Standard jargon includes <a href=\"https:\/\/www.experfy.com\/blog\/explaining-supervised-learning-to-a-kid-or-your-boss\">instance, feature, label, model, and algorithm<\/a>.<\/li>\n<\/ul>\n<h4 id=\"6c68\"><strong>What\u2019s unsupervised learning?<\/strong><\/h4>\n<figure id=\"d236\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*qZ0K8_atP_HvWv2vsjw5Hg.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*qZ0K8_atP_HvWv2vsjw5Hg.png\" \/><\/figure>\n<p id=\"483e\" style=\"text-align: center;\">Your mission? Put these six images into two groups however you\u00a0like.<\/p>\n<p>Check out the six <a href=\"https:\/\/www.experfy.com\/blog\/explaining-supervised-learning-to-a-kid-or-your-boss\">instances<\/a>\u00a0above. What\u2019s missing? These photographs are not accompanied by labels. No worries, your brain is pretty good at unsupervised learning. Let\u2019s try it.<\/p>\n<p id=\"d30c\">Think about how you\u2019d like to split these images into two groups. There are no wrong answers. Ready?<\/p>\n<h4 id=\"8557\"><strong>Clustering the\u00a0data<\/strong><\/h4>\n<p id=\"d923\">In a live class, Googlers shout out answers like \u201c<em>sitting versus standing,<\/em>\u201d \u201c<em>can see a wooden floor versus can\u2019t,<\/em>\u201d \u201c<em>cat selfie versus not cat selfie,<\/em>\u201d and so on. Let\u2019s examine the first answer.<\/p>\n<figure id=\"a286\"><img decoding=\"async\" style=\"width: 700px; height: 318px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*ylcjzeLDlHiuryx179JFjw.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*ylcjzeLDlHiuryx179JFjw.png\" \/><\/figure>\n<p style=\"text-align: center;\">One way to split the images into two clusters: sitting versus standing. 
Well, \u201csitting\u201d versus standing.<\/p>\n<h4 id=\"0e9d\"><strong>Unsupervised learning\u2019s secret\u00a0labels<\/strong><\/h4>\n<p id=\"5260\">If you chose to define your clusters based on whether the cats are standing, what are the labels your system outputs? Machine learning is about <a href=\"https:\/\/www.experfy.com\/blog\/the-simplest-explanation-of-machine-learning-youll-ever-read\">labeling things<\/a>, after all.<\/p>\n<p id=\"cf07\">If you\u2019re thinking \u201c<em>sitting versus standing<\/em>\u201d are the labels, think again! That\u2019s the recipe (model) you\u2019re using for creating your clusters. The labels in unsupervised learning are far more boring: something like \u201c<em>Group 1 and Group 2<\/em>\u201d or \u201c<em>A or B<\/em>\u201d or \u201c<em>0 or 1<\/em>\u201d. They simply indicate group membership, and they have no additional human-interpretable (or poetic) meaning.<\/p>\n<figure id=\"9c37\"><img decoding=\"async\" style=\"width: 700px; height: 319px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*SbqLoWH_Y-tXq6yqMo7lRw.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*SbqLoWH_Y-tXq6yqMo7lRw.png\" \/><\/figure>\n<p id=\"bf86\" style=\"text-align: center;\">Unsupervised learning\u2019s labels simply indicate cluster membership. They have no higher human-interpretable meaning, as disappointingly boring as that may\u00a0feel.<\/p>\n<p>All that is happening here is that the algorithm groups things by similarity. The similarity measure is specified by the choice of algorithm, but why not try as many algorithms as possible? After all, you don\u2019t know what you\u2019re looking for, and that\u2019s okay. 
Think of unsupervised learning as a sort of mathematical version of making \u201c<em>birds of a feather flock together.<\/em>\u201d<\/p>\n<p id=\"cea8\">Like a\u00a0<a href=\"https:\/\/en.m.wikipedia.org\/wiki\/Rorschach_test\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.m.wikipedia.org\/wiki\/Rorschach_test\">Rorschach card<\/a>, the results are there to help you dream. Don\u2019t take whatever you see in them too seriously.<\/p>\n<h4 id=\"fafa\"><strong>Look again!<\/strong><\/h4>\n<p id=\"dd4f\">As the proud mother of these two individual cats, I\u2019m saddened that in the 50 or so times I\u2019ve taught this lesson, only one audience noticed: \u201c<em>Cat 1 versus Cat 2.<\/em>\u201d Instead, I get answers like \u201c<em>sitting, standing<\/em>\u201d or \u201c<em>wooden floor absent\/present<\/em>\u201d or sometimes even \u201c<em>ugly cats versus pretty cats.<\/em>\u201d (Awww.)<\/p>\n<figure id=\"68e9\"><img decoding=\"async\" style=\"width: 700px; height: 326px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*CASi5zfagbE-I0-55rYaLQ.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/1*CASi5zfagbE-I0-55rYaLQ.png\" \/><\/figure>\n<p style=\"text-align: center;\">Turns out these were photos of my two individual cats! Maybe you spotted it, but most of my audiences don\u2019t\u2026 unless I give them the labels (supervise their learning). If I\u2019d presented the data with name labels in the first place and then asked you to classify the next photo, I bet you\u2019d find the task\u00a0easy.<\/p>\n<h4 id=\"a4cf\"><strong>Lessons learned<\/strong><\/h4>\n<p id=\"516c\">Imagine I\u2019m a novice data scientist getting started with unsupervised learning and (naturally!) interested in my own two cats. I won\u2019t be able to unsee my cats when I look at these data. 
Because my cuddlebugs are so meaningful to me, I expect my unsupervised machine learning system to be able to recover the only thing worth caring about here. Oops!<\/p>\n<p id=\"0932\">Before this decade, computers couldn\u2019t even hope to compete with the best pattern finder in the world for this kind of task: the human brain. This is easy for\u00a0people! So why did the thousands of Googlers who saw these unlabeled photos miss the \u201c<em>Cat 1 versus Cat 2<\/em>\u201d answer?<\/p>\n<blockquote id=\"c67c\"><p>Think of unsupervised learning as a sort of mathematical version of making \u201cbirds of a feather flock together.\u201d<\/p><\/blockquote>\n<p id=\"31da\">Just because something\u2019s interesting to me doesn\u2019t mean my pattern finder will\u00a0find\u00a0it. Even if the pattern finder is awesome, I didn\u2019t tell it what I\u2019m looking\u00a0for, so why would I expect my\u00a0<a href=\"https:\/\/www.youtube.com\/watch?v=iLu9XyZ55oI&amp;feature=youtu.be&amp;t=0h5m12s\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.youtube.com\/watch?v=iLu9XyZ55oI&amp;feature=youtu.be&amp;t=0h5m12s\">learning algorithm<\/a>\u00a0to deliver it? This isn\u2019t magic! If I don\u2019t tell it what the right answers\u00a0are&#8230; I get what I get and I don\u2019t get upset. All I can do is look at the clusters the system returns for me and see if I find them inspiring. If I don\u2019t\u00a0like\u00a0&#8217;em, I just run a different unsupervised algorithm (\u201cSomeone else in the audience, split them for me a different way\u201d) over and over until something feels interesting.<\/p>\n<blockquote id=\"68f3\"><p>The results are a Rorschach card to help you\u00a0dream.<\/p><\/blockquote>\n<p id=\"f54a\">There\u2019s no guarantee that anything inspiring will come out of the process, but it doesn\u2019t hurt to try. Exploring the unknown is supposed to be a bit of an adventure, after all. 
Have fun with it!<\/p>\n<p id=\"9869\">In future episodes, we\u2019ll look at cautionary tales of what can go wrong if you forget that the labels are just an inspiration and shouldn\u2019t be taken too seriously, let alone treated as human-interpretable. (Hint: there may be mention of finding Elvis\u00a0in\u00a0a\u00a0slice\u00a0of\u00a0toast.) The labels are just there to give you ideas about what you might like to dive into next.<\/p>\n<p id=\"dc23\"><strong><em>Summary:<\/em><\/strong>\u00a0Unsupervised learning helps you find inspiration in data by grouping similar things together for you. There are many different ways of defining similarity, so keep trying algorithms and settings until a cool pattern catches your eye.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Unsupervised learning may sound like a fancy way to say &ldquo;let the kids learn on their own not to touch the hot oven&rdquo;&nbsp;but it&rsquo;s actually a pattern-finding technique for mining inspiration from your data. It has nothing to do with machines running around without adult supervision, forming their own opinions about things. &nbsp;Unsupervised learning helps you find inspiration in data by grouping similar things together for you. There are many different ways of defining similarity, so keep trying algorithms and settings until a cool pattern catches your eye. 
Let&rsquo;s demystify!<\/p>\n","protected":false},"author":335,"featured_media":2904,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[187],"tags":[94],"ppma_author":[2050],"class_list":["post-893","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bigdata-cloud","tag-data-science"],"authors":[{"term_id":2050,"user_id":335,"is_guest":0,"slug":"cassie-kozyrkov","display_name":"Cassie Kozyrkov","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2020\/04\/medium_df35f80d-2bff-4fe3-b741-a94d51320e00-150x150.jpg","author_category":"","user_url":"https:\/\/careers.google.com\/?src=Online\/LinkedIn\/linkedin_profilepage&amp;utm_source","last_name":"Kozyrkov","first_name":"Cassie","job_title":"","description":"Cassie Kozyrkov is Chief Decision Scientist at Google, Inc. With a unique combination of deep technical expertise, and world-class public-speaking skills, she has provided guidance on more than 100 projects and designed Google's analytics program, personally training over 15000 Googlers in statistics, decision-making, and machine 
learning.\u00a0"}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/893","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/335"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=893"}],"version-history":[{"count":0,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/893\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/2904"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=893"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=893"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=893"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=893"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}