{"id":975,"date":"2018-11-12T02:54:23","date_gmt":"2018-11-11T23:54:23","guid":{"rendered":"http:\/\/kusuaks7\/?p=580"},"modified":"2021-12-15T05:21:25","modified_gmt":"2021-12-15T05:21:25","slug":"ways-in-which-machines-learn","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/ways-in-which-machines-learn\/","title":{"rendered":"Ways in Which Machines Learn"},"content":{"rendered":"<p><strong><em>Ready to learn Machine Learning? Browse<\/em><\/strong> <strong><em><a href=\"https:\/\/www.experfy.com\/training\/tracks\/machine-learning-training-certification\">Machine Learning Training and Certification courses<\/a> developed by industry thought leaders and Experfy in Harvard Innovation Lab.<\/em><\/strong><\/p>\n<p id=\"96aa\">There are four major ways to train deep learning networks:\u00a0<em>supervised<\/em>,<em>unsupervised<\/em>,\u00a0<em>semi-supervised<\/em>, and\u00a0<em>reinforcement learning<\/em>.\u00a0We\u2019ll explain the intuitions behind each of the these methods. Along the way, we\u2019ll share terms you\u2019ll read in the literature in parentheses and point to more resources for the mathematically inclined. By the way, these categories span both traditional machine learning algorithms and the newer, fancier deep learning algorithms.<\/p>\n<p id=\"cd3a\">For the math-inclined, see\u00a0<a href=\"http:\/\/ufldl.stanford.edu\/tutorial\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/ufldl.stanford.edu\/tutorial\/\" data->this Stanford tutorial which covers supervised and unsupervised learning<\/a>\u00a0and includes code samples.<\/p>\n<h3 id=\"1c43\">Supervised Learning<\/h3>\n<p id=\"537a\"><strong>Supervised Learning<\/strong>\u00a0trains networks using examples where we already know the correct answer. Imagine we are interested in training a network to recognize pictures from your photo library that have your parents in them. 
Here are the steps we\u2019d take in that hypothetical scenario.<\/p>\n<h3 id=\"c4f8\">Step 1: Data Set Creation and Categorization<\/h3>\n<p id=\"339a\">We would start the process by going through your photos (the data set) and identifying and labeling all the pictures that have your parents in them. We would then take the whole stack of photos and split them into two piles. We would use the first pile to train the network (training data) and the second pile to see how accurate the model is at picking out photos with your parents (validation data).<\/p>\n<p id=\"edb8\">Once the data sets are ready, we\u2019d feed the photos to the model. Mathematically, our goal is for the deep network to find a function whose input is a photo and whose output is a 0 when your parents are not in the photo or a 1 when they are.<\/p>\n<p id=\"af06\">This step is usually called a\u00a0<em>classification task<\/em>. In this case we\u2019re training for results that are yes-no, but supervised learning can also be used to output a set of values, rather than just a 0 or 1. For example, we might train a network to output the probability that someone will repay a credit card loan, in which case the output is anywhere between 0 and 100. These tasks are called regressions.<\/p>\n<h3 id=\"1b52\">Step 2:\u00a0Training<\/h3>\n<p id=\"6294\">To continue the process, the model makes a prediction for each photo by following rules (the activation function) to decide whether to light up a particular node in the network. The model works from left to right, one layer at a time (we will ignore more complicated networks for the moment). After the network calculates this for every node in the network, we\u2019ll get to the rightmost node (the output node), which lights up or not.<\/p>\n<p id=\"b2e2\">Since we already know which pictures have your parents in them, we would be able to tell the model whether its prediction is right or wrong. 
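To make the layer-by-layer forward pass just described concrete, here is a minimal sketch in plain Python. The layer sizes, weight values, and the sigmoid activation are illustrative assumptions, not details from the original article:

```python
import math

def sigmoid(x):
    # Activation function: squashes a node's input into (0, 1),
    # deciding how strongly the node "lights up".
    return 1.0 / (1.0 + math.exp(-x))

def forward(photo_features, layers):
    # Work from left to right, one layer at a time.
    # `layers` is a list of (weights, biases) pairs, one per layer.
    activations = photo_features
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations[0]  # the rightmost (output) node

# Toy network: 3 input features -> 2 hidden nodes -> 1 output node.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),  # hidden layer
    ([[1.2, -0.7]], [-0.3]),                              # output layer
]
prediction = forward([0.9, 0.1, 0.4], layers)
print(round(prediction, 3))  # a value between 0 and 1
```

An output near 1 would mean "parents in the photo"; near 0, "no parents". Real networks learn the weights rather than hard-coding them, which is exactly what the training step below is about.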
We would then\u00a0<em>feed back<\/em>\u00a0this information to the network.<\/p>\n<p id=\"d5ed\">The algorithm uses this feedback, which is the result of a function that quantifies \u201chow far off the model\u2019s prediction is from the real answer\u201d. This is called a\u00a0<em>cost function<\/em>, also known as an\u00a0<em>objective function<\/em>,\u00a0<em>utility function<\/em>,\u00a0or\u00a0<em>fitness function<\/em>. The result of the function is then used to modify the strength of the connections (weights) and the biases between nodes in a process called\u00a0<em>backpropagation<\/em>, since the information travels \u201cbackwards\u201d from the result nodes.<\/p>\n<p id=\"d227\">We\u2019d repeat this for each of the pictures, and in each case the algorithm tries to minimize the cost function.<\/p>\n<p id=\"b1ad\">There are a variety of mathematical techniques for feeding this knowledge of whether the model was right or wrong back into the model, but a very common method is gradient descent.\u00a0<a href=\"https:\/\/algobeans.com\/2016\/11\/03\/artificial-neural-networks-intro2\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/algobeans.com\/2016\/11\/03\/artificial-neural-networks-intro2\/\" data->Algobeans<\/a>\u00a0has a good layman\u2019s explanation of how this works. Michael Nielsen\u00a0<a href=\"http:\/\/neuralnetworksanddeeplearning.com\/chap2.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/neuralnetworksanddeeplearning.com\/chap2.html\" data->adds the math<\/a>,\u00a0which involves calculus and linear algebra (and a friendly demon!).<\/p>\n<h3 id=\"8cf3\">Step 3:\u00a0Verify<\/h3>\n<p id=\"6e2e\">Once we\u2019ve processed all the photos from our first stack, we will be ready to test the model. 
We would grab the second stack of photos and use them to see how accurately the trained model can pick out photos of your parents.<\/p>\n<p id=\"12c4\">Steps 2 and 3 would typically be repeated by tweaking various things about the model (hyperparameters), such as how many nodes there are, how many layers there are, which mathematical function to use to decide whether a node lights up, how aggressively to train the weights during the backpropagation phase, and so on. This\u00a0<a href=\"https:\/\/www.quora.com\/What-are-hyperparameters-in-machine-learning\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.quora.com\/What-are-hyperparameters-in-machine-learning\" data->Quora answer<\/a>\u00a0has a good explanation of the knobs you can turn.<\/p>\n<h3 id=\"53e1\">Step 4:\u00a0Use<\/h3>\n<p id=\"2973\">Finally, once you have an accurate model, you deploy that model into your application. You expose the model as an API call, such as\u00a0<code>ParentsInPicture(photo)<\/code>, and you can call that method from your software, causing the model to make an inference and give you the result.<\/p>\n<p id=\"40ed\">We\u2019ll go through this exact process later to write an iPhone application that recognizes business cards.<\/p>\n<p id=\"e4ca\">It can be hard (that is, expensive) to get a labeled data set, so you need to make sure the value of the prediction justifies the cost of getting the labeled data and training the model in the first place. For example, getting labeled X-rays of people who might have cancer is expensive, but the value of an accurate model that generates few false positives and few false negatives is obviously very high.<\/p>\n<h3 id=\"ea85\">Unsupervised Learning<\/h3>\n<p id=\"1bcb\"><strong>Unsupervised Learning<\/strong>\u00a0is for situations where you have a data set but no labels. 
Unsupervised learning takes the input set and tries to find patterns in the data, for instance by organizing them into groups (clustering) or finding outliers (anomaly detection). For example:<\/p>\n<ul>\n<li id=\"51a8\">Imagine you are a T-shirt manufacturer, and you have a bunch of people\u2019s body measurements. You\u2019d like a clustering algorithm that groups those measurements into a set of clusters so you can decide how big to make your XS, S, M, L, and XL shirts.<\/li>\n<li id=\"2858\">You are the CTO of a security startup and you want to find anomalies in the history of network connections between computers: network traffic that looks unusual might help you find an employee downloading all their CRM history because they are about to quit or someone transferring an abnormally large amount of money to a new bank account. If you\u2019re interested in this sort of thing, you\u2019ll like this\u00a0<a href=\"http:\/\/journals.plos.org\/plosone\/article?id=10.1371\/journal.pone.0152173\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/journals.plos.org\/plosone\/article?id=10.1371\/journal.pone.0152173\" data->survey of unsupervised anomaly detection algorithms<\/a>.<\/li>\n<li id=\"b971\">You are on the Google Brain team, and you wonder what\u2019s in YouTube videos. This is the very real story of the \u201cYouTube cat finder\u201d research that\u00a0<a href=\"https:\/\/www.wired.com\/2012\/06\/google-x-neural-network\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.wired.com\/2012\/06\/google-x-neural-network\/\" data->kindled the general public\u2019s enthusiasm for AI<\/a>. 
In\u00a0<a href=\"https:\/\/arxiv.org\/abs\/1112.6209\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/arxiv.org\/abs\/1112.6209\" data->this paper<\/a>, the Google Brain team in conjunction with Stanford researchers Quoc Le and Andrew Ng describe an algorithm that groups YouTube videos into a bunch of categories, including one that contained cats. They didn\u2019t set out to find cats, but the algorithm automatically grouped videos containing cats (and thousands of other objects from the 22,000 object categories defined in ImageNet) together without any explicit training data.<\/li>\n<\/ul>\n<p id=\"8b0d\">Some unsupervised learning techniques you\u2019ll read about in the literature include:<\/p>\n<ul>\n<li id=\"ffce\"><a href=\"http:\/\/ufldl.stanford.edu\/tutorial\/unsupervised\/Autoencoders\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/ufldl.stanford.edu\/tutorial\/unsupervised\/Autoencoders\/\" data->Autoencoding<\/a><\/li>\n<li id=\"ff87\"><a href=\"https:\/\/www.quora.com\/What-is-an-intuitive-explanation-for-PCA\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.quora.com\/What-is-an-intuitive-explanation-for-PCA\" data->Principal components analysis<\/a><\/li>\n<li id=\"36a2\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Random_forest\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Random_forest\" data->Random forests<\/a><\/li>\n<li id=\"cd92\"><a href=\"https:\/\/www.youtube.com\/watch?v=RD0nNK51Fp8\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.youtube.com\/watch?v=RD0nNK51Fp8\" data->K-means clustering<\/a><\/li>\n<\/ul>\n<p id=\"5c97\">To learn more about unsupervised learning, try\u00a0<a href=\"https:\/\/www.udacity.com\/course\/machine-learning-unsupervised-learning--ud741\" target=\"_blank\" rel=\"noopener noreferrer\" 
data-href=\"https:\/\/www.udacity.com\/course\/machine-learning-unsupervised-learning--ud741\" data->this Udacity class<\/a>.<\/p>\n<p id=\"9a52\">One of the\u00a0<a href=\"https:\/\/www.quora.com\/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-deep-learning\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.quora.com\/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-deep-learning\" data->most promising recent developments in unsupervised learning<\/a>\u00a0is an idea from Ian Goodfellow (who was working in Yoshua Bengio\u2019s lab at the time) called \u201cgenerative adversarial networks\u201d in which we pit two neural networks against each other: one network, called the generator is responsible for generating data designed to try and trick the other network, called the discriminator. This approach is achieving some amazing results, such as AI which can generate photo-realistic pictures from\u00a0<a href=\"https:\/\/arxiv.org\/abs\/1612.03242\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/arxiv.org\/abs\/1612.03242\" data->text strings<\/a>\u00a0or\u00a0<a href=\"https:\/\/arxiv.org\/pdf\/1611.07004v1.pdf\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/arxiv.org\/pdf\/1611.07004v1.pdf\" data->hand-drawn sketches<\/a>.<\/p>\n<h3 id=\"b147\">Semi-supervised Learning<\/h3>\n<p id=\"54c5\"><strong>Semi-supervised learning<\/strong>\u00a0combines a lot of unlabeled data with a small amount of labeled data during the training phase. The trained models that result from this training set can be highly accurate and less expensive to train compared to using all labeled data. 
Our friend Delip Rao at the AI consulting company\u00a0<a href=\"http:\/\/joostware.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/joostware.com\/\" data->Joostware<\/a>, for example, built a semi-supervised solution that, using just 30 labels per class, achieved the same accuracy as a supervised model that required ~1360 labels per class. This enabled their client to scale their prediction capabilities from 20 categories to 110 categories very quickly.<\/p>\n<p id=\"7e49\">One intuition behind why using unlabeled data can sometimes help make models more accurate: even if you don\u2019t know the answer, you are learning something about what the possible values are and how often specific values appear.<\/p>\n<p id=\"4f8d\">Math fans: try Xiaojin Zhu\u2019s\u00a0<a href=\"http:\/\/pages.cs.wisc.edu\/~jerryzhu\/pub\/sslicml07.pdf\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/pages.cs.wisc.edu\/~jerryzhu\/pub\/sslicml07.pdf\" data->epic 135-slide tutorial<\/a>\u00a0and the\u00a0<a href=\"http:\/\/pages.cs.wisc.edu\/~jerryzhu\/pub\/ssl_survey.pdf\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/pages.cs.wisc.edu\/~jerryzhu\/pub\/ssl_survey.pdf\" data->accompanying paper, which surveys the literature through 2008<\/a>.<\/p>\n<h3 id=\"b988\">Reinforcement Learning<\/h3>\n<p id=\"862b\"><strong>Reinforcement learning<\/strong>\u00a0is for situations where you again don\u2019t have labeled data sets, but you do have a way of telling whether you are getting closer to your goal (a reward function). The classic children\u2019s game \u201chotter or colder\u201d (a variant of\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Huckle_buckle_beanstalk\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Huckle_buckle_beanstalk\" data->Huckle Buckle Beanstalk<\/a>) is a good illustration of the concept. 
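In that spirit, a reward function and an agent that maximizes it can be sketched in a few lines. This toy searches a number line for a hidden object using only hotter/colder feedback; the grid, step size, and greedy acceptance rule are illustrative assumptions, not a real reinforcement learning algorithm:

```python
import random

random.seed(0)
hidden = 17      # position of the hidden object on a number line
position = 0     # where the agent starts

def reward(old, new):
    # The "hotter/colder" reward function: +1 if the move brought
    # us closer to the hidden object, -1 if it took us farther away.
    return 1 if abs(new - hidden) < abs(old - hidden) else -1

# Greedy policy: propose a random step, keep it only if the
# reward function says "hotter".
for _ in range(500):
    step = random.choice([-1, 1])
    if reward(position, position + step) == 1:
        position += step
    if position == hidden:
        break

print(position)
```

Notice that the agent never sees a labeled "right answer", only the hotter/colder signal, yet maximizing that signal is enough to find the object.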
Your job is to find a hidden object, and your friends will call out whether you are getting \u201chotter\u201d (closer to) or \u201ccolder\u201d (farther from) the object. \u201cHotter\/colder\u201d is the reward function, and the goal of the algorithm is to maximize the reward function. You can think of the reward function as a delayed and sparse form of labeled data: rather than getting a specific \u201cright\/wrong\u201d answer with each data point, you\u2019ll get a delayed reaction and only a hint of whether you\u2019re heading in the right direction.<\/p>\n<ul>\n<li id=\"d758\">DeepMind\u00a0<a href=\"https:\/\/deepmind.com\/blog\/deep-reinforcement-learning\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/deepmind.com\/blog\/deep-reinforcement-learning\/\" data->published a paper in Nature<\/a>\u00a0describing a system that combines reinforcement learning with deep learning to learn to play a set of Atari video games, some with great success (like Breakout) and others terribly (like Montezuma\u2019s Revenge).<\/li>\n<li id=\"bdde\">The Nervana team (now at Intel) published\u00a0<a href=\"https:\/\/www.nervanasys.com\/demystifying-deep-reinforcement-learning\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.nervanasys.com\/demystifying-deep-reinforcement-learning\/\" data->an excellent explanatory blog post<\/a>\u00a0that walks through the techniques in detail.<\/li>\n<li id=\"fd02\">A very creative Stanford student project by Russell Kaplan, Christopher Sauer, and Alexander Sosa illustrates one of the challenges with reinforcement learning and suggests a clever solution. You\u2019ll see in the DeepMind paper that the algorithms failed to learn how to play Montezuma\u2019s Revenge. 
The reason for this is that, as the Stanford students describe, \u201creinforcement learning agents still struggle to learn in environments with sparse rewards.\u201d When you don\u2019t get enough \u201chotter\u201d or \u201ccolder\u201d hints, you have a hard time finding the hidden key. The Stanford students taught the system to understand and respond to natural language hints such as \u201cclimb down the ladder\u201d or \u201cget the key\u201d, making their system the top-scoring algorithm in the OpenAI Gym. Watch a\u00a0<a href=\"https:\/\/drive.google.com\/file\/d\/0B2ZTvWzKa5PHSkJvQVlsb0FLYzQ\/view\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/drive.google.com\/file\/d\/0B2ZTvWzKa5PHSkJvQVlsb0FLYzQ\/view\" data->video of the algorithm in action<\/a>.<\/li>\n<\/ul>\n<figure id=\"7654\"><img decoding=\"async\" style=\"width: 624px; height: 305px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/0*tDksfE3n84GUvASl.png\" \/><\/figure>\n<ul>\n<li id=\"ed1e\">Watch the video of a reinforcement learning system that\u00a0<a href=\"https:\/\/www.youtube.com\/watch?v=L4KBBAwF_bE\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.youtube.com\/watch?v=L4KBBAwF_bE\" data->learned to play Super Mario World like a boss<\/a>.<\/li>\n<\/ul>\n<p id=\"b55c\">Richard Sutton and Andrew Barto\u00a0wrote the book on Reinforcement Learning. 
Check out the\u00a0<a href=\"http:\/\/incompleteideas.net\/sutton\/book\/the-book-2nd.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/incompleteideas.net\/sutton\/book\/the-book-2nd.html\" data->draft of the 2nd edition<\/a>.<\/p>\n<blockquote id=\"ea93\"><p>Originally published in\u00a0<a href=\"http:\/\/a16z.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/a16z.com\/\" data->Andreessen Horowitz<\/a>\u2019s\u00a0<a href=\"http:\/\/aiplaybook.a16z.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/aiplaybook.a16z.com\/\" data->AI Playbook<\/a>.<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>There are four major ways to train deep learning networks:&nbsp;supervised, unsupervised,&nbsp;semi-supervised, and&nbsp;reinforcement learning.&nbsp;We&rsquo;ll explain the intuitions behind each of these methods. Along the way, we&rsquo;ll share terms you&rsquo;ll read in the literature in parentheses and point to more resources for the mathematically inclined. 
By the way, these categories span both traditional machine learning algorithms and the newer, fancier deep learning algorithms.<\/p>\n","protected":false},"author":390,"featured_media":3443,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[183],"tags":[97],"ppma_author":[2195],"class_list":["post-975","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-artificial-intelligence"],"authors":[{"term_id":2195,"user_id":390,"is_guest":0,"slug":"frank-chen","display_name":"Frank Chen","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Chen","first_name":"Frank","job_title":"","description":"Frank Chen&nbsp;is Partner at Andreessen Horowitz, a venture firm helping entrepreneurs in Venture capital, artificial intelligence\/machine learning, fundraising, and product planning."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/975","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/390"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=975"}],"version-history":[{"count":2,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/975\/revisions"}],"predecessor-version":[{"id":28396,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/975\/revisions\/28396"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/3443"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=975"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.exp
erfy.com\/blog\/wp-json\/wp\/v2\/categories?post=975"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=975"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=975"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}