{"id":1421,"date":"2019-02-15T10:32:09","date_gmt":"2019-02-15T10:32:09","guid":{"rendered":"http:\/\/kusuaks7\/?p=1026"},"modified":"2023-06-28T15:40:42","modified_gmt":"2023-06-28T15:40:42","slug":"learning-ai-if-you-suck-at-math-part7-the-magic-of-natural-language-processing","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/learning-ai-if-you-suck-at-math-part7-the-magic-of-natural-language-processing\/","title":{"rendered":"Learning AI if You Suck at Math \u200a- \u200aPart 7 \u200a- \u200aThe Magic of Natural Language Processing"},"content":{"rendered":"<p><strong><em>Ready to learn Artificial Intelligence? <a href=\"https:\/\/www.experfy.com\/training\/courses\">Browse courses<\/a>\u00a0like\u00a0 <a href=\"https:\/\/www.experfy.com\/training\/courses\/uncertain-knowledge-and-reasoning-in-artificial-intelligence\">Uncertain Knowledge and Reasoning in Artificial Intelligence<\/a> developed by industry thought leaders and Experfy in Harvard Innovation Lab.<\/em><\/strong><\/p>\n<p id=\"9a84\">After discovering\u00a0<a href=\"https:\/\/hackernoon.com\/learning-ai-if-you-suck-at-math-p5-deep-learning-and-convolutional-neural-nets-in-plain-english-cda79679bbe3#.7ci7zh7v3\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/hackernoon.com\/learning-ai-if-you-suck-at-math-p5-deep-learning-and-convolutional-neural-nets-in-plain-english-cda79679bbe3#.7ci7zh7v3\" data->the amazing power of convolutional neural networks for image recognition<\/a>\u00a0in part five of this series, I decided to dive head first into\u00a0<a href=\"http:\/\/blog.algorithmia.com\/introduction-natural-language-processing-nlp\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/blog.algorithmia.com\/introduction-natural-language-processing-nlp\/\" data->Natural language Processing or NLP<\/a>. 
(If you missed the earlier articles, be sure to check them out:\u00a0<a href=\"https:\/\/www.experfy.com\/blog\/learning-ai-if-you-suck-at-math-part-1\">part 1<\/a>,\u00a0<a href=\"https:\/\/www.experfy.com\/blog\/learning-ai-if-you-suck-at-math-part-two-practical-projects\">part 2<\/a>,\u00a0<a href=\"https:\/\/www.experfy.com\/blog\/learning-ai-if-you-suck-at-math-part3-building-an-ai-dream-machine\">part 3<\/a>, <a href=\"https:\/\/www.experfy.com\/blog\/learning-ai-if-you-suck-at-math-part4-tensors-illustrated-with-cats\">part 4<\/a>, <a href=\"https:\/\/www.experfy.com\/blog\/learning-ai-if-you-suck-at-math-part5-deep-learning-and-convolutional-neural-nets-in-plain-english\">part 5<\/a>, and <a href=\"https:\/\/www.experfy.com\/blog\/learning-ai-if-you-suck-at-math-part6-math-notation-made-easy\">part 6<\/a>.)<\/p>\n<p id=\"fb6b\">This hotbed of machine learning research teaches computers to understand how people talk. When you ask Siri or the\u00a0<a href=\"https:\/\/assistant.google.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/assistant.google.com\/\" data->Google Assistant<\/a>\u00a0a question, it\u2019s NLP that drives the conversation. Of course, as an author of novels and articles, working with language seemed like the obvious next step for me.<\/p>\n<p id=\"d342\"><strong>I may suck at math but words are my domain!<\/strong><\/p>\n<p id=\"9a78\">So I set out to uncover what insights NLP could give me about my own area of mastery.<\/p>\n<figure id=\"764e\" data-scroll=\"native\"><canvas width=\"75\" height=\"69\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*nAXkV-KG2rXtEX-lcW4fAA.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*nAXkV-KG2rXtEX-lcW4fAA.jpeg\" \/><\/figure>\n<p style=\"text-align: center;\">Behold the\u00a0Bard!<\/p>\n<p id=\"a2df\">I had so many questions. Had NLP uncovered the hidden keys to writing heart-wrenching poems? 
Could AIs turn phrases better than\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/William_Shakespeare\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/William_Shakespeare\" data->the Bard<\/a>? Could they elucidate the secret to writing\u00a0<a href=\"https:\/\/larseidnes.com\/2015\/10\/13\/auto-generating-clickbait-with-recurrent-neural-networks\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/larseidnes.com\/2015\/10\/13\/auto-generating-clickbait-with-recurrent-neural-networks\/\" data->compulsively clickable headlines<\/a>?<\/p>\n<p id=\"3316\">Luckily, I had just the right project in mind to test the limits of NLP. I was in the midst of naming the second book in my epic sci-fi saga\u00a0<a href=\"http:\/\/meuploads.com\/the-jasmine-wars-2\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/meuploads.com\/the-jasmine-wars-2\/\" data->The Jasmine Wars<\/a>\u00a0but I\u2019d struggled to find the perfect title. So I wondered:<\/p>\n<p id=\"c3eb\"><strong>What if I could feed a neural net with the greatest titles of all time and have it deliver a title for the ages?<\/strong><\/p>\n<p id=\"79ad\">This isn\u2019t my first foray into computer assisted title generation. 
There are\u00a0<a href=\"http:\/\/www.fantasynamegenerators.com\/book-title-generator.php#.WL9Yyn9uNOI\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/www.fantasynamegenerators.com\/book-title-generator.php#.WL9Yyn9uNOI\" data->a number of random title generators<\/a>\u00a0<a href=\"http:\/\/www.kitt.net\/php\/title.php\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/www.kitt.net\/php\/title.php\" data->out on the interwebs<\/a>\u00a0<a href=\"http:\/\/www.mcoorlim.com\/random.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/www.mcoorlim.com\/random.html\" data->that I\u2019ve tried<\/a>\u00a0from time to time.<\/p>\n<p id=\"ab3f\">Frankly, they\u2019re not very good.<\/p>\n<p id=\"187b\">They\u2019re the type of toy you play with for a few minutes and then move on. They work by randomly slamming words together or by iterating through a few basic permutations like \u201cThe _______ of _________.\u201d I seriously doubt a single author actually selected his or her title from the primordial word soup these engines produce.<\/p>\n<p id=\"bf6e\">Throwing words into a hat, shaking it up and pulling them out won\u2019t get you very far. A million monkeys typing randomly on keyboards might make Shakespeare in a million years, but I don\u2019t have that kind of time.<\/p>\n<p id=\"23be\">AI to the rescue!<\/p>\n<h3 id=\"22c3\">Networks that Peer into the Depths of\u00a0Time<\/h3>\n<p id=\"b0d4\">As we learned in\u00a0<a href=\"https:\/\/hackernoon.com\/learning-ai-if-you-suck-at-math-p5-deep-learning-and-convolutional-neural-nets-in-plain-english-cda79679bbe3#.7ci7zh7v3\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/hackernoon.com\/learning-ai-if-you-suck-at-math-p5-deep-learning-and-convolutional-neural-nets-in-plain-english-cda79679bbe3#.7ci7zh7v3\" data->part five<\/a>, neural networks hold amazing power because they do\u00a0<em>automatic feature extraction<\/em>. 
We can\u2019t tell a machine all the steps we take to drive a car but we can let it figure them out all by itself!<\/p>\n<p id=\"972e\">As an author I use all kinds of tricks to capture people\u2019s attention but trying to boil those down to a set of rules is virtually impossible. It goes well beyond simply understanding nouns, verbs and adjectives. There\u2019s a rhythm to language. Words can spark fiery images in your mind. They can overwhelm you with emotion, making you break down in tears or quiver with anticipation. They create sound and fury, movement and feeling.<\/p>\n<p id=\"8276\">Can a machine do all of that?<\/p>\n<p id=\"079e\">Am I on the chopping block of automation?<\/p>\n<p id=\"0214\">Will AI make writers redundant in the future?<\/p>\n<p id=\"9756\">To find out, I first needed to figure out what kind of neural network (NN) I needed. NNs are very specific to the problem they\u2019re trying to solve. Humans might possess a\u00a0<a href=\"http:\/\/amzn.to\/2nD9Mg2\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/amzn.to\/2nD9Mg2\" data->universal learning algorithm<\/a>\u00a0but we certainly don\u2019t know what it is yet. 
The current state of the art focuses on \u201cnarrow AI\u201d with each neural net doing some things well and other things really badly.<\/p>\n<p id=\"8e8e\"><strong>So what kind of NN helps us understand language?<\/strong><\/p>\n<p id=\"67c5\"><strong>Hands down, the dominant force behind NLP is the Recurrent Neural Network (RNN), in particular the Long Short Term Memory (LSTM) RNN.<\/strong><\/p>\n<p id=\"f20e\">So let\u2019s take a look at these and see if they can help me unlock the secrets of blockbuster title creation.<\/p>\n<h3 id=\"f696\"><strong>The Magic of Recurrent Neural\u00a0Nets<\/strong><\/h3>\n<p id=\"412b\">Inevitably, when you start looking into RNNs you discover OpenAI researcher Andrej Karpathy\u2019s blog post \u201c<a href=\"http:\/\/karpathy.github.io\/2015\/05\/21\/rnn-effectiveness\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/karpathy.github.io\/2015\/05\/21\/rnn-effectiveness\/\" data->The Unreasonable Effectiveness of Recurrent Neural Networks<\/a>.\u201d The title alone filled me with tremendous hope.<\/p>\n<p id=\"7566\">Just how unreasonably effective are these amazing systems?<\/p>\n<p id=\"4d10\">If the title didn\u2019t get me, the first line surely did:<\/p>\n<p id=\"7d5a\">\u201cThere\u2019s something magical about Recurrent Neural Networks (RNNs).\u201d<\/p>\n<p id=\"13c6\">Magical!<\/p>\n<p id=\"2ff6\">I knew a fantastic title couldn\u2019t be far off, its supernatural power already swirling in the hidden depths of the matrix.<\/p>\n<p id=\"3548\">So what makes RNNs \u201cmagical?\u201d First, they\u2019re particularly adept at predicting the future.<\/p>\n<figure id=\"b24c\" data-scroll=\"native\"><canvas width=\"27\" height=\"75\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*t_OZqG9ZTpCEYanYUIcCkQ.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*t_OZqG9ZTpCEYanYUIcCkQ.jpeg\" \/><\/figure>\n<p id=\"86cd\">When you buy a stock or pick someone up at 
the airport, you\u2019re making a guess about the future. A baseball player trying to snag a fly ball has to predict the arc of the ball and leap to where it\u2019s going in order to catch it.<\/p>\n<p id=\"5309\">We make predictions all the time, whether we\u2019re weaving our way through big city foot traffic or driving a car.<\/p>\n<p id=\"59aa\">Are those other cars going to hit you?<\/p>\n<p id=\"8f5e\">Is someone veering into your lane?<\/p>\n<p id=\"b36e\">Where will your friend be waiting for you at the airport?<\/p>\n<p id=\"9e39\"><strong>We\u2019re constantly trying to predict what happens next and react to it ahead of time so we\u2019re ready. RNNs do the same thing by analyzing time series data<\/strong>.<\/p>\n<p id=\"d2f0\">They can look forward and, unlike most other NNs, they can look back too. They have a \u201cmemory\u201d of events past. They can follow the trajectory of a rocket, or watch a stock price move and predict a buy or sell. When it comes to self-driving cars they can predict trajectories and arcs, which means they can help prevent accidents (as you see in the footage of a Tesla chiming a warning before a crash) or decide when to take an off ramp.<\/p>\n<p id=\"1a23\">They\u2019re also good at working with sequences of arbitrary length. That\u2019s unusual because most NNs can only take fixed size vectors\/tensors and output fixed size vectors. With convolutional neural nets we have to munge our images into a certain shape to make them work. But that\u2019s a no go for text. You can\u2019t take a novel or the entirety of Wikipedia and jam it into a one size fits all box. 
This flexibility makes RNNs great for NLP, which encompasses everything from\u00a0<a href=\"https:\/\/www.nytimes.com\/2016\/12\/14\/magazine\/the-great-ai-awakening.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.nytimes.com\/2016\/12\/14\/magazine\/the-great-ai-awakening.html\" data->machine translation<\/a>\u00a0to\u00a0<a href=\"http:\/\/text-processing.com\/demo\/sentiment\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/text-processing.com\/demo\/sentiment\/\" data->sentiment analysis<\/a>\u00a0to\u00a0<a href=\"https:\/\/assistant.google.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/assistant.google.com\/\" data->Google\u2019s Pixel AI<\/a>\u00a0understanding your questions.<\/p>\n<p id=\"e9b3\">This even gives RNNs a level of \u201ccreativity.\u201d Check out\u00a0<a href=\"http:\/\/www.hexahedria.com\/2015\/08\/03\/composing-music-with-recurrent-neural-networks\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/www.hexahedria.com\/2015\/08\/03\/composing-music-with-recurrent-neural-networks\/\" data->this article on generating music with RNNs<\/a>. 
The system is trained on a series of musical sequences and \u201cmakes\u201d music by predicting the likely next sequences.<\/p>\n<p id=\"8221\">What else can RNNs do?<\/p>\n<p id=\"6e06\">If we were doing sentiment analysis, aka trying to figure out if people are feeling good or bad about something, then we could feed it a sequence of inputs (the words of a movie review) and have it output a single classification score from love (1) to hate (-1).<\/p>\n<figure id=\"1a3a\" data-scroll=\"native\"><canvas width=\"75\" height=\"75\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*2tOHRjirYAvMf2BAO2TQ5w.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*2tOHRjirYAvMf2BAO2TQ5w.png\" \/><\/figure>\n<p id=\"ebd6\">You could also feed it a\u00a0<em>single input<\/em>\u00a0and have it deliver a\u00a0<em>series of outputs<\/em>. For example, we could feed the network an image (single input) and have it generate a text caption describing that image (series of outputs).\u00a0<a href=\"http:\/\/cs.stanford.edu\/people\/karpathy\/deepimagesent\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/cs.stanford.edu\/people\/karpathy\/deepimagesent\/\" data->Check out this article that shows how RNNs look at pictures and deliver summaries like \u201cBoy doing a backflip off a wakeboard.\u201d<\/a><\/p>\n<p id=\"fe1a\">The system locates objects and tries to create a sentence from that, like we see with this gal playing tennis.<\/p>\n<figure id=\"8671\"><canvas width=\"75\" height=\"63\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*9B_usMGpVpgUr48c4xjMjw.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*9B_usMGpVpgUr48c4xjMjw.png\" \/><\/figure>\n<p id=\"b0d9\">We could also chain a\u00a0<em>sequence to vector network, called an encoder<\/em>,\u00a0with its reverse,\u00a0<em>a vector to sequence network, which we call a decoder<\/em>.<\/p>\n<figure id=\"072d\" data-scroll=\"native\"><canvas width=\"75\" 
height=\"41\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*5-dYhA1CD7j57EJfJyp3gw.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*5-dYhA1CD7j57EJfJyp3gw.png\" \/><\/figure>\n<p id=\"9ff3\">This is useful for machine translation (<a href=\"https:\/\/medium.com\/@ageitgey\/machine-learning-is-fun-part-5-language-translation-with-deep-learning-and-the-magic-of-sequences-2ace0acca0aa\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/medium.com\/@ageitgey\/machine-learning-is-fun-part-5-language-translation-with-deep-learning-and-the-magic-of-sequences-2ace0acca0aa\" data->as seen in this awesome article from the Machine Learning is Fun dude<\/a>).\u00a0<a href=\"https:\/\/www.nytimes.com\/2016\/12\/14\/magazine\/the-great-ai-awakening.html?_r=0\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.nytimes.com\/2016\/12\/14\/magazine\/the-great-ai-awakening.html?_r=0\" data->Google recently used RNNs to revamp their Google Translate system<\/a>\u00a0into something that blows away the gobbledygook translations of previous versions, delivering translations that approach human quality. And on April 11, 2017,\u00a0<a href=\"https:\/\/research.googleblog.com\/2017\/04\/introducing-tf-seq2seq-open-source.html?m=1\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/research.googleblog.com\/2017\/04\/introducing-tf-seq2seq-open-source.html?m=1\" data->Google open sourced a general-purpose encoder-decoder framework built on the ideas behind their translation engine, called tf-seq2seq<\/a>\u00a0(not the best marketing name but hey, we\u2019ll take it). 
Basically what we do is train the network with both the original text\u00a0<em>and<\/em>\u00a0a professionally translated text in another language (the input and output) and then use that to help the machine translate fresh documents it\u2019s never seen.<\/p>\n<p id=\"ece2\">But for our purposes we need one very specific feature of RNNs:<\/p>\n<p id=\"baf7\"><strong>They\u2019re great at generating text.<\/strong><\/p>\n<p id=\"ddeb\">It\u2019s also what Karpathy\u2019s \u201cmagical RNN\u201d blog post is about and what got me interested.<\/p>\n<p id=\"7c9d\">Let\u2019s take a quick look at how RNNs work and then leap into how I attempted to use them to generate my next great American novel title.<\/p>\n<h3 id=\"e2e3\">Time After\u00a0Time<\/h3>\n<p id=\"07cb\">Recurrent Nets look a lot like feed forward neural nets, except they also have connections that point\u00a0<em>backwards<\/em>. If we look at a simple single neuron RNN, we can see that it receives inputs X at a particular point in time, which we call a \u201cframe,\u201d as well as its own output from the previous step, Y(t-1).<\/p>\n<figure id=\"db66\"><canvas width=\"75\" height=\"63\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 607px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*nLevY4zRetjYvePmPmbj-Q.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*nLevY4zRetjYvePmPmbj-Q.jpeg\" \/><\/figure>\n<p id=\"983a\">The network is really a series of steps in time. It\u2019s no coincidence that each step is called a frame, because it\u2019s like the frame in a film. 
We \u201cunroll the network\u201d through time, just as we play a movie on the silver screen.<\/p>\n<figure id=\"ac50\"><canvas width=\"75\" height=\"33\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 321px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*07P_Ocfuo81W4Us4XFcYtg.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*07P_Ocfuo81W4Us4XFcYtg.jpeg\" \/><\/figure>\n<p id=\"ce51\">Time is represented by \u201ct\u201d. The current moment in time is just \u201ct\u201d which we see in the middle frame. The previous step is \u201ct-1\u201d (on the left) and one step into the future is \u201ct+1\u201d (the right). The s in the middle is the hidden state, hence the \u201cs\u201d for state. It is the memory of the cell. In pure feed-forward networks the inputs to a node are just\u00a0<em>the weighted outputs of previous nodes<\/em>. In an RNN, this also includes the weighted outputs\u00a0<em>from a previous time step<\/em>. In other words, like we said earlier, it can\u00a0<em>look back in time<\/em>\u00a0and it can attempt to predict a future step.<\/p>\n<h3 id=\"41d4\">Developing Long Term\u00a0Memory<\/h3>\n<p id=\"75e6\">Basic RNNs have a few challenges. One of the main challenges with plain old RNNs is that if the unrolled network is too deep, meaning the sequence is long, it can easily begin to \u201cforget\u201d information from earlier parts of the time sequence.<\/p>\n<p id=\"bae4\">Why is that an issue?<\/p>\n<p id=\"3786\">Well, let\u2019s pretend you have an RNN doing sentiment analysis of news about stocks, looking to generate buy or sell signals based on whether the public is bullish or bearish on a stock. A stock blogger may start off telling you to sell in the first sentence and then spend the rest of the article lauding the future buy potential of the stock once it has a few weeks to recover from whatever news is damaging the stock today. 
The system may forget the \u201csell\u201d part and declare it a strong \u201cbuy\u201d based on the positive sentiment later in the story.<\/p>\n<p id=\"ca2c\">Plain RNNs can also have difficulty learning long range dependencies, even over shorter sequences. That becomes a serious problem in NLP because the meaning of a sentence isn\u2019t always clustered closely together.<\/p>\n<p id=\"db20\">Most people don\u2019t produce sentences that would make their grade school grammar teacher proud. Instead they scatter the meaning all over the sentence. They use screwy grammar and slang. For humans this is no problem. We have the remarkable ability to understand sentences that are all jacked up. Misplaced modifiers, missing words, typos, and dangling participles won\u2019t slow us down but they can really trip up machines.<\/p>\n<p id=\"1b0d\">For example, if I say \u201cThe man in the blue blazer and white cap played a brilliant jazz solo\u201d, the point of the sentence is not what the man is wearing, which sits close to the subject of the sentence, but that he played a brilliant jazz solo. If the system has forgotten the man by the time it gets to the music, it has missed the point.<\/p>\n<p id=\"799b\">The root cause is what\u2019s known as\u00a0<a href=\"https:\/\/cs224d.stanford.edu\/notebooks\/vanishing_grad_example.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/cs224d.stanford.edu\/notebooks\/vanishing_grad_example.html\" data->the vanishing gradients problem, which Stanford\u2019s awesome deep learning and NLP class goes into at length<\/a>.<\/p>\n<p id=\"4068\">But I\u2019ll save you a lot of reading and give you a quick summary here. To understand vanishing gradients you need to understand a bit about backpropagation. 
In\u00a0<a href=\"https:\/\/hackernoon.com\/learning-ai-if-you-suck-at-math-p5-deep-learning-and-convolutional-neural-nets-in-plain-english-cda79679bbe3#.viug9k8sh\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/hackernoon.com\/learning-ai-if-you-suck-at-math-p5-deep-learning-and-convolutional-neural-nets-in-plain-english-cda79679bbe3#.viug9k8sh\" data->part five<\/a>\u00a0we talked about how backpropagation looked to minimize errors by working to\u00a0<em>the lowest point on the error landscape<\/em>. That helps the neural network adjust its weights so that it can go to the next epoch of training. It looks like this in 3D:<\/p>\n<figure id=\"e947\"><canvas width=\"75\" height=\"66\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*MTG0AXCRlRXeN7i-EBrDUw.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*MTG0AXCRlRXeN7i-EBrDUw.png\" \/><\/figure>\n<p id=\"6997\">In some ways it is easier to understand in 2D though, so let\u2019s see that:<\/p>\n<figure id=\"9a42\"><canvas width=\"75\" height=\"58\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 562px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*T6jzGH7BgHGoae8KaVcJCQ.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*T6jzGH7BgHGoae8KaVcJCQ.jpeg\" \/><\/figure>\n<p id=\"7428\">The system is taking tiny steps, as it tries to work its way to the bottom of the curve. Now, that\u2019s all well and good when you have a clean error landscape with a nice well-defined curve. But what if the curve flattens out badly? 
Let\u2019s take a look.<\/p>\n<figure id=\"89e5\"><canvas width=\"75\" height=\"58\"><\/canvas><img decoding=\"async\" style=\"width: 639px; height: 497px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*evNDe-0l7WKGPeU14NeEgA.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*evNDe-0l7WKGPeU14NeEgA.jpeg\" \/><\/figure>\n<p style=\"text-align: center;\">Courtesy of the Stanford Deep NLP\u00a0course<\/p>\n<p id=\"3c8a\">When the line flattens out we call the neurons \u201csaturated.\u201d The slope there is close to zero, so instead of activating and learning from useful data, they are effectively dead. Even worse, they have an exponentially bad effect on the neurons before them in the chain. Remember that neural networks are built from matrices, which are really just spreadsheets on steroids. One cell is added to or multiplied by the next cell in a long chain of equations.<\/p>\n<figure id=\"2e37\" data-scroll=\"native\"><canvas width=\"57\" height=\"75\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*i8_yJmVmt2NYjmkFFOsfZg.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*i8_yJmVmt2NYjmkFFOsfZg.jpeg\" \/><\/figure>\n<p id=\"ae0c\">By the way, if you\u2019re still struggling with matrix math, I am loving this manga book called\u00a0<a href=\"http:\/\/amzn.to\/2or8aoA\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/amzn.to\/2or8aoA\" data->The Manga Guide to Linear Algebra<\/a>.<\/p>\n<p id=\"3f68\">The Japanese just have better teaching tools. 
If I\u2019d had this book in school I might have enjoyed math a lot more.<\/p>\n<p id=\"1c7e\">Back to the math!<\/p>\n<p id=\"84c1\">Remember our dot product image from\u00a0<a href=\"https:\/\/hackernoon.com\/learning-ai-if-you-suck-at-math-p6-math-notation-made-easy-1277d76a1fe5\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/hackernoon.com\/learning-ai-if-you-suck-at-math-p6-math-notation-made-easy-1277d76a1fe5\" data->part six on math notation<\/a>?<\/p>\n<figure id=\"4f9f\"><canvas width=\"75\" height=\"30\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 285px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*nylC61biC9qoSjBHftY41A.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*nylC61biC9qoSjBHftY41A.jpeg\" \/><\/figure>\n<p id=\"970b\">Now imagine all of those numbers are zero or almost zero. What happens to the chain of calculations?<\/p>\n<p id=\"d281\">When many of the values in that chain are small numbers, the multiplication causes the\u00a0<em>gradient values to shrink exponentially fast, which quickly drives all the neurons in the chain towards zero<\/em>. This means they\u2019re\u00a0<strong><em>effectively turned off and doing nothing<\/em><\/strong>. They\u2019re like dead pixels on a TV screen, no longer useful. The deeper the network, the worse this problem gets.<\/p>\n<p id=\"d0f2\">A number of solutions to this problem cropped up over the years. The first was to use the ReLU activation function instead of the tanh or sigmoid activation functions.<\/p>\n<p id=\"d9be\">Why do that? Well, you just have to look at a sigmoid curve to understand.<\/p>\n<figure id=\"eb2b\"><canvas width=\"75\" height=\"63\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*PdvlQrJDQ-CYhfImXovDUA.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*PdvlQrJDQ-CYhfImXovDUA.png\" \/><\/figure>\n<p id=\"a539\">Notice how it has that nice curved edge at the bottom and top? 
We want curves like that when we\u2019re drawing a face or the arch of a bridge, but those flat ends are the slope of despair when it comes to vanishing gradients.<\/p>\n<p id=\"bdac\">Now look at a ReLU vs sigmoid visualization:<\/p>\n<figure id=\"5d33\"><canvas width=\"75\" height=\"25\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*HTBuShCMyQWzWb1TuPtQjg.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*HTBuShCMyQWzWb1TuPtQjg.jpeg\" \/><\/figure>\n<p id=\"9878\">Notice that hard angle! The ReLU function outputs zero for negative inputs and the input itself for positive ones, so its slope is a constant 0 or 1. As you can see, it has a hard shape with no soft slope at the edges, so on the positive side it isn\u2019t as likely to hit that vanishing problem.<\/p>\n<p id=\"0459\">But there\u2019s a better solution. Let\u2019s check that out.<\/p>\n<h3 id=\"2125\">Enter the\u00a0Dragon<\/h3>\n<p id=\"c926\">The real answer to the question of vanishing gradients is not to change the activations on a regular RNN.<\/p>\n<p id=\"27c0\"><strong>It\u2019s to switch to the more popular\u00a0<\/strong><a href=\"http:\/\/bioinf.jku.at\/publications\/older\/2604.pdf\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/bioinf.jku.at\/publications\/older\/2604.pdf\" data-><strong>Long Short Term Memory (LSTM), first outlined in 1997<\/strong><\/a><strong>, or\u00a0<\/strong><a href=\"https:\/\/arxiv.org\/pdf\/1412.3555.pdf\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/arxiv.org\/pdf\/1412.3555.pdf\" data-><strong>Gated Recurrent Units (GRUs), outlined in 2014<\/strong><\/a><strong>.<\/strong><\/p>\n<p id=\"f8c7\">Both of these architectures were designed with vanishing gradients in mind. They were also meant to look for long range dependencies. In practice regular RNNs are rarely used anymore, while GRUs and LSTMs dominate the field.<\/p>\n<p id=\"8f6a\">The name LSTM might seem strange at first but not when you consider what the network is doing. 
In essence an LSTM is a black box memory cell that looks like a standard RNN memory cell but in reality it holds dual states in two vectors, a long term state and a short term state.<\/p>\n<p id=\"20f2\"><a href=\"https:\/\/deeplearning4j.org\/lstm.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/deeplearning4j.org\/lstm.html\" data->Here is an illustration of an LSTM from my friends over at the Deep Learning 4 J team<\/a>.<\/p>\n<figure id=\"231d\"><canvas width=\"75\" height=\"36\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 349px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*JgLsDjE0uNtt09ypywZyOQ.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*JgLsDjE0uNtt09ypywZyOQ.png\" \/><\/figure>\n<p id=\"95c2\">By the way, one of the absolute best books I\u2019ve read on this topic (and neural nets\/deep learning in general) is the just released\u00a0<a href=\"http:\/\/amzn.to\/2p5t4Ll\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/amzn.to\/2p5t4Ll\" data-><strong>Hands-On Machine Learning with Scikit-Learn and Tensorflow<\/strong><\/a>. I saw a few earlier editions and they really upped my game. Don\u2019t wait, just grab it ASAP. It rocks. It goes into a ton more detail than I have here but I\u2019ll give you the basics to get you moving in the right direction fast.<\/p>\n<figure id=\"768f\" data-scroll=\"native\"><canvas width=\"57\" height=\"75\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*HcJYfBrjz7JNfvwIZgxWRQ.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*HcJYfBrjz7JNfvwIZgxWRQ.jpeg\" \/><\/figure>\n<p id=\"9bc5\">You can see that information travels along two lines through a series of \u201cgates.\u201d The top line is called the \u201cforget line.\u201d This is a pretty piss poor term, in my humble opinion, but I didn\u2019t name it, so don\u2019t blame me. 
Let\u2019s just go with it.<\/p>\n<p id=\"3a75\"><strong>The \u201cforget line\u201d remembers the long term state.<\/strong><\/p>\n<p id=\"30b9\">It gets\u00a0<em>copied forward<\/em>\u00a0into\u00a0<em>new cells<\/em>\u00a0as the\u00a0<em>network unrolls<\/em>. Actually, it\u2019s not a completely ridiculous name.<\/p>\n<p id=\"c318\">It\u2019s called the forget line because it does lose bits of information as it goes.<\/p>\n<p id=\"6cfb\"><strong>The other lines contain short term associations and memories, which are then incorporated into the \u201cforget\u201d line.<\/strong><\/p>\n<p id=\"939e\">At each time step some memories go out the window and some get added.<\/p>\n<p id=\"851b\">GRUs are basically a simplified form of LSTMs. Check out\u00a0<a href=\"http:\/\/kvitajakub.github.io\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/kvitajakub.github.io\/\" data->this awesome diagram of a GRU from Jakub Kvita, a comp-sci student and former Red Hatter, who made it for his thesis.<\/a><\/p>\n<figure id=\"c6a4\"><canvas width=\"75\" height=\"44\"><\/canvas><img decoding=\"async\" style=\"width: 700px; height: 431px;\" src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*EhW4AkiHv11enr_PS0K2fw.png\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/720\/1*EhW4AkiHv11enr_PS0K2fw.png\" \/><\/figure>\n<p id=\"c5fb\">What\u2019s the difference?<\/p>\n<p id=\"9ab5\"><strong>The GRU cell merges the long and short term memory into\u00a0<em>a single vector.<\/em><\/strong><\/p>\n<p id=\"55fe\">Why do that? Simple. Performance. It\u2019s less computationally expensive and yet somehow seems to perform as well. That\u2019s a win!<\/p>\n<p id=\"d539\">It also uses only a single \u201cgate\u201d to control both what is forgotten and what is added to that memory. Lastly, it adds a new kind of gate that decides which part of the previous state gets shown to the main layer.<\/p>\n<h3 id=\"3c93\"><strong>Monkeys in the\u00a0Machine<\/strong><\/h3>\n<p id=\"d749\">OK. 
All that\u2019s great, Dan, but how do I generate text from that?<\/p>\n<p id=\"c67d\">Good question.<\/p>\n<p id=\"073d\">Karpathy\u2019s post demonstrates a \u201ccharacter\u201d level RNN. A character level model looks to understand language on a character by character basis.<\/p>\n<p id=\"71ce\">How does it do that?<\/p>\n<p id=\"7a97\">All neural networks are essentially complicated\u00a0<em>prediction engines<\/em>. So we feed the system millions of words and it stores those words as sequences of characters. Then it begins to predict what the next character is likely to be. Once it\u2019s learned what to predict we can then have the system pull tricks for us, like generating sample text from a \u201cseed\u201d set of words that we feed it. That\u2019s all theoretical, so let\u2019s look at a simple example.<\/p>\n<p id=\"b2c0\">First, let\u2019s pretend that the system has only learned a few words:<\/p>\n<ul>\n<li id=\"8d7a\">hey<\/li>\n<li id=\"a63a\">hello<\/li>\n<li id=\"34d5\">help<\/li>\n<li id=\"c27a\">I<\/li>\n<li id=\"0b8f\">there<\/li>\n<li id=\"f8c5\">need<\/li>\n<\/ul>\n<p id=\"ada8\">We also teach it a few punctuation marks like \u201c.\u201d and \u201c!\u201d<\/p>\n<p id=\"5cf4\">Remember though that our simple RNN hasn\u2019t learned complete words. It\u2019s only learned a series of characters, so instead of understanding \u201chello\u201d as an entire self contained entity, it knows h-e-l-l-o. It knows that \u201ce\u201d follows \u201ch\u201d and so on.<\/p>\n<p id=\"ec4b\">Now imagine that I show the system a million variants of sentences that I can construct from the few vocabulary words that I\u2019ve taught the machine. Those sentences might be something like:<\/p>\n<ul>\n<li id=\"f078\">Hey, there. Hello!<\/li>\n<li id=\"9f00\">Hello! 
Help!<\/li>\n<li id=\"910d\">Help!<\/li>\n<li id=\"d1c1\">Help.<\/li>\n<\/ul>\n<p id=\"8084\">I then seed the engine with the phrase:<\/p>\n<ul>\n<li id=\"a12a\">\u201cI need he\u201d<\/li>\n<\/ul>\n<p id=\"065e\">Notice that I didn\u2019t write the complete word that I want it to guess.<\/p>\n<p id=\"0025\">The system would then look inside its black box and try to predict the next likely character. In this case it could be \u201cl\u201d as in \u201chello,\u201d or it could be \u201cy\u201d as in \u201chey,\u201d or it could be \u201cl\u201d as in \u201chelp.\u201d<\/p>\n<p id=\"312d\">If the network is properly trained we hope it chooses \u201cl\u201d and eventually \u201cp\u201d for \u201chelp\u201d because that\u2019s one of the few constructions that make sense.<\/p>\n<p id=\"ff15\"><a href=\"https:\/\/github.com\/fchollet\/keras\/blob\/master\/examples\/lstm_text_generation.py\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/fchollet\/keras\/blob\/master\/examples\/lstm_text_generation.py\" data-><strong>We can find a character level RNN implementation in the Keras examples on GitHub<\/strong><\/a><strong>.<\/strong><\/p>\n<p id=\"1ab1\">It\u2019s trained on a corpus of Nietzsche with about 100,000 words. The example recommends that we use at least a million words to make the system more robust.<\/p>\n<p id=\"d7e5\"><a href=\"https:\/\/github.com\/the-laughing-monkey\/learning-ai-if-you-suck-at-math\/blob\/master\/Deep%20Learning%20Examples\/GreatBookTitles.txt\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/the-laughing-monkey\/learning-ai-if-you-suck-at-math\/blob\/master\/Deep%20Learning%20Examples\/GreatBookTitles.txt\" data->I decided to feed my character level RNN a dataset that I created by hand over a few days<\/a>\u00a0which you can find on my GitHub. 
I typed in every single great novel title by combing through my memory, bookshelf, and numerous top 100 lists.<\/p>\n<p id=\"88b7\">Unfortunately, I quickly ran out of great book titles.<\/p>\n<p id=\"c272\">One option would be to simply feed it as many titles as I could find by downloading library catalogs, but I wanted to focus on titles that really stood out and not clog it up with any old crap. To augment it I went on to great movie titles, then great songs and band names.<\/p>\n<p id=\"e85c\">Still, when I was done, I was left with a mere 26K worth of words, which made the system particularly unreliable. But I decided to give it a go anyway. So how did it do? Here are a few results.<\/p>\n<pre id=\"5629\">tha ect are dog<\/pre>\n<pre id=\"fc89\">a9t byta go than<\/pre>\n<pre id=\"b33c\">wel pt year benc<\/pre>\n<p id=\"b348\">Hardly magical.<\/p>\n<p id=\"2f10\">Even after training the system for many, many, many epochs it still mostly sucked. I ran the system for 7000 iterations overnight. It still produced garbage.<\/p>\n<p id=\"3a95\">At this point I couldn\u2019t tell whether the problem was the tiny dataset that I gave it or the RNN itself. Rather than brute force tweak the system, I decided to see if I could find an answer to that question before spending five nights tuning the system to no end. As I puzzled over why it failed, I turned back to Karpathy\u2019s blog and found a potential answer.<\/p>\n<p id=\"cc7b\">Karpathy trained his character level generator on Shakespeare, with significantly more text for the machine to eat up. 
Here is an example from his post:<\/p>\n<div id=\"6e89\"><span style=\"font-family: courier new,courier,monospace;\"><code>\u201cPANDARUS: Alas, I think he shall be come approached and the day When little srain would be attain'd <\/code><\/span><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\"><code>into being never fed, And who is but a chain and subjects of his death, I should not sleep.<\/code><\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\"><code>Second Senator: They are away this miseries, produced upon my soul, Breaking and strongly should be <\/code><\/span><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\"><code>buried, when I perish The earth and thoughts of many states.\u201d<\/code><\/span><\/div>\n<p>&nbsp;<\/p>\n<p id=\"e419\">He\u2019s particularly excited that the system seems to be generating text that\u00a0<em>looks like Shakespeare<\/em>, at least at first glance.<\/p>\n<p id=\"9c3d\">It is formatted like a play. There is dialogue. There are character names. It even has a little flavor of the Bard with words like \u201calas.\u201d<\/p>\n<p id=\"d9a7\">In some respects this is truly amazing. Remember that the system doesn\u2019t know anything about English. It has no context. It has no knowledge of verbs or characters or dialogue at all. It learned that through grokking the patterns and outputting a similar pattern.<\/p>\n<p id=\"a753\">However, as a writer, I found myself less enamored with this output than Karpathy.<\/p>\n<p id=\"4474\">While it\u2019s true that the system aped the basic formatting of a play, I don\u2019t see this as much of a feat. We had dumb systems capable of auto-formatting a play for screenwriters in the 1980s. The biggest thing I notice is that the system produced gibberish that\u2019s formatted nicely, but that means absolutely nothing. It produces words, but the words put together add up to zilch. 
The sentences mean nada.<\/p>\n<p id=\"3b46\">Basically, it detected a pattern but not a very useful one. I wanted it to learn poetry and it learned how to act like a sort of smart version of\u00a0<a href=\"https:\/\/www.writersstore.com\/movie-magic-screenwriter-screenwriting-software\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.writersstore.com\/movie-magic-screenwriter-screenwriting-software\/\" data->Screenwriter Pro<\/a>.<\/p>\n<p id=\"9d28\">But I didn\u2019t lose hope!<\/p>\n<p id=\"e26e\">Intuitively, I recognized that it makes little sense to try to train these systems at the character level.<\/p>\n<p id=\"6547\">Why make the system work so hard to try to predict what the next character should be so as to form some semblance of words?<\/p>\n<p id=\"f758\">Notice that even in the Shakespeare output it sometimes produced nonsense words like \u201csrain\u201d which means that even after hours of training it was still struggling to avoid kindergarten level mistakes. I wondered if researchers realized, like I did, that it made more sense to train the system at the \u201cword\u201d level or even the \u201csentence\u201d level. In other words, instead of studying \u201ch-e-l-l-o\u201d train it on \u201chello\u201d.<\/p>\n<p id=\"1bc9\">Turns out they did.<\/p>\n<p id=\"3b34\">I discovered\u00a0<a href=\"https:\/\/github.com\/hunkim\/word-rnn-tensorflow\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/hunkim\/word-rnn-tensorflow\" data->this modification of the basic character RNN, turning it into a word level monster<\/a>. This system also introduced the more advanced concepts of LSTMs and GRUs. Awesome! Now the system can learn whole words, instead of letters.<\/p>\n<p id=\"05c3\">And in a testament to just how fast this field is developing, dozens of word level RNN systems have appeared since I started this article three months ago. 
I had some other work take precedence, and some of the earlier Learning AI articles seemed to fit better if they came first, so I set this article aside. Now I come back to find a\u00a0<a href=\"https:\/\/github.com\/vlraik\/word-level-rnn-keras\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/vlraik\/word-level-rnn-keras\" data->number of different versions<\/a>\u00a0of\u00a0word level RNNs\u00a0for\u00a0<a href=\"https:\/\/github.com\/vlraik\/word-level-rnn-keras\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/vlraik\/word-level-rnn-keras\" data->you to play with<\/a>\u00a0and\u00a0<a href=\"https:\/\/github.com\/pytorch\/examples\/tree\/master\/word_language_model\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/pytorch\/examples\/tree\/master\/word_language_model\" data->test out<\/a>. Sweet!<\/p>\n<p id=\"16af\">So did it work? Here\u2019s some output after training the system for thousands of epochs, again remembering that my dataset is far from the ideal size.<\/p>\n<div id=\"a520\"><span style=\"font-family: courier new,courier,monospace;\">Play Go The Wide Virgin Me is Teen Scream I, Masque and a Champions The For Is with Myself Tears, the Tropic of the Looking Ugly The Journey of to Big Empire The Red What Adventures The Naked Nails Dirty What&#8217;s West Twenty Mask in the End of Earth As Dance to the Atlantis Was Be If Even In Me Paradiso Crime Smokestack Mojo Jest The Carpenter The Nightmare of Heights The Golden Twenty House So 1\/2 Hand in the Drugs were God The Snows and the Rain Cat Things We Thank My Knew L.A. Did Deep The Goblet in Steal: The an These Along the Bonfire The End of Quarter Halloween Madonna Mote Killshot Way of the River Torturer The Inc. 
Rex The Anvil of Imagination were Sabbath Wild Morning Angry Mice The Thin Street Tangled Got In Want Pretty a Turning of the Beethoven not Salem&#8217;s Atuan Break, Lost Red Charlotte&#8217;s Drummer Giving Ship A Susie On Mars The Night Don&#8217;t Still Crash Spy In the Ritz The Goblet of Heaven The Cure Good Cosmos The Time&#8217;s Brigade this Dreams Can&#8217;t Folsom Dove You Jumping Hide Come is a City Wars in the Taming In Like for the Mind<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">All Above Terra Doom Things Rehab Exit You Lays Heat The Devil Outrageous Cry Clash Place The Ashes Men Side The Toyshop The Velvet in the Red A Road Without Little Red Of Door Comedy <\/span><span style=\"font-family: courier new,courier,monospace;\">Undery<\/span><span style=\"font-family: courier new,courier,monospace;\"> Me a Gods The Eden and the Black Badge In Stop the Wall and the Night 96 Captain! Street to Time on the Earth of Bees Steel Why to Empty Got I Want Myself Rolling Iron in Everything Songs Oh, Be <\/span><span style=\"font-family: courier new,courier,monospace;\">nd<\/span><span style=\"font-family: courier new,courier,monospace;\"> Folsom A Grifters The Game The Secret Fountainhead The River Nine Germs Nights for Me Are Know You Wear Miles in on Stuff Up Vanity Sleep The Clash A Empire A Lost in a Sex Machine Wake Dazed What Steel Steal: for Chocolate Secret Planet Moment Purple Red Snow Some Are Dark, Me You a Row Suspicious Detective Surrender Will Hound Delicatessen None The Cathedral of Empires What Mary Going Big Whom need by this The Dancer Up Summer Nine Kill Night Fight Dog Cross and the Bob World California I 101 Suede Drummer Book Pyscho Prophet Eye of the River Men Man I War Be Eyed Be Video Dream See Samurai The Widening Baby The Standing Express Untrodden The Man of the<\/span><\/div>\n<div><\/div>\n<p id=\"c7a7\">It outputs a giant block of text that is a little hard to deal with, so I wrote 
a little script to slice it up into 2 to 7 word sentences, which is about the length of a good title. Most good titles actually live in the four word range.<\/p>\n<p id=\"ab72\">That gave me some good results that I saved to a file, discarding the obvious gibberish ones.<\/p>\n<div id=\"ea69\"><span style=\"font-family: courier new,courier,monospace;\">Sleepless in Cryptonomicon<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">The Sun Rope<br \/>\nDelicatessen in the Jungle<br \/>\nDaisy the Cloudy Shoplifters<br \/>\nWaiting for a Glass Full<br \/>\nBlood Agency<br \/>\nThe China Proposal<br \/>\nBeloved Mayor of Horton<br \/>\nWalking China<br \/>\nThe Metropolis Jacket<br \/>\nThe Steel Beowulf<br \/>\nMagnolias Dawn, Little Prarie Sun<br \/>\nFried Castle Blind<br \/>\nSense of Disobedience<br \/>\nThe Meatballs Dune<br \/>\nChina Hooker Tomatoes<br \/>\nOf Slave Blood<br \/>\nIn the Usual House<br \/>\nTrial Fried Castle<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">Why Eternity Glass<br \/>\nThe Lovely Wide Evil<br \/>\nThe Bright Gene<br \/>\nThe Infinity Half<br \/>\nThe Lathe of Dr Dispossessed<br \/>\nTo Murder Proud<br \/>\nThe Sick Archbishop<br \/>\nGun Man Blue<br \/>\nIn the Silence<br \/>\nThe Radio Who Dragons Through<br \/>\nGlory of the Dead<br \/>\nA Golden Geisha<br \/>\nThe Sand Woods<br \/>\nGates of Cholera<br \/>\nA Right Good Dawn<br \/>\nA Rosetta Ruby<br \/>\nNew Tide Sky<br \/>\nThe Fire Plan<br \/>\nMan to Barbarism<br \/>\nThe Deception Needle<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">The River Break<br \/>\nThe Secret Electric Manifesto<br \/>\nCity of Lost Faces<br \/>\nJude the Key<br \/>\nMystic Germs<br \/>\nThe Roman Woods<br \/>\nGold Sweet Death<br \/>\nThe Brand Morgue<br \/>\nSweet Dreams Piano<br \/>\nLoving Shanghai<br \/>\nEnd of Lolita Childhood<br \/>\nCold Geisha<br 
\/>\nThe Last Baby<br \/>\nGood Journey into the Light<br \/>\nThe Door Song<br \/>\nSong for Want<br \/>\nThe Bitter Lady<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">I, Samurai<br \/>\nIn Me, Not Get Proud<br \/>\nMystic Sex<br \/>\nThe Death of Walter<br \/>\nStop Heaven&#8217;s Sun<br \/>\nOne Mystic Cannibal<br \/>\nthe Cannibal&#8217;s Candle<br \/>\nThe Secret Red Sky<br \/>\nPeople of the Fire<br \/>\nStardust Winter&#8217;s Love<br \/>\nJohnny Never Gonna Stop<br \/>\nGone Thunder Rolling<br \/>\nThe Metamorphosis Fish<br \/>\nSnowy Spots the Rainbow<br \/>\nThe Tabloid Bums<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">The Invisible Deep<br \/>\nthe Deep and Unbearable<br \/>\nCall of Fire<br \/>\nThe Cuckoo&#8217;s Jekyll<br \/>\nThe Red Tenderness<br \/>\nThe Raven&#8217;s School<br \/>\nThe Memories of God<br \/>\nThe Cave Dragon<br \/>\nJim the Savage<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">Sunset Now on Brooklyn<br \/>\nBlack Song<br \/>\nI Was Toys<br \/>\nThe Snows Creek Came<br \/>\nThe Secret Land<br \/>\nThe Well<br \/>\nThe Last Lies<br \/>\nLords of the Knife<br \/>\nInside Physics<br \/>\nThe Galaxy of Gone<br \/>\nThe Satanic Playlist<br \/>\nThe Bloody 9<\/span><br \/>\n<span style=\"font-family: courier new,courier,monospace;\">Freakonmics<\/span><span style=\"font-family: courier new,courier,monospace;\">: A Hard Black Dance<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">Stone of Fire<br \/>\nA Road Death<br \/>\nThe Feast Baby<br \/>\nLucifer&#8217;s Rainbow<br \/>\nA Severed Cage<br \/>\nOf Summertime Glass<br \/>\nLucky Break in the Night<br \/>\nthe knife Man<br \/>\nPrison Rain<br \/>\nThe Door to the Cosmos<br \/>\nSolitude in the Frost<br \/>\nThe Clockwork Chamber<br \/>\nThe Black Queen<br \/>\nBack to the Wind<br \/>\nThe Blind 
Fields<br \/>\nMarathon of Fear<br \/>\nSophie&#8217;s Dragons<br \/>\nThe First New Madre Soldier<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">Jurassic Magnolias<br \/>\nSeattle Siddhartha<br \/>\nThe Glass Dawn<br \/>\nThe Beloved Metropolis<br \/>\nThe Glass Temple<br \/>\nSteel Woods<br \/>\nThe House of Inception<br \/>\nThe Tao of the Third<br \/>\nLonesome Winter&#8217;s Man<br \/>\nSugar Acid<br \/>\nThe Piano Ashes<br \/>\nThe Anarchist&#8217;s Game<br \/>\nThe Furious Tenderness<br \/>\nThe Red Hallows<br \/>\nParadise Demons<br \/>\nDemons of Time<br \/>\nCosmos, I Ride<br \/>\nThe Machine King<br \/>\nThe King&#8217;s Blue Grass<br \/>\nThe End of Kashmir<br \/>\nThe Secret Soldier<br \/>\nLove of Sunshine<br \/>\nThe Night of the Rose<br \/>\nTea House Cowgirls<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">The Vishnu Indigo<br \/>\nDeath of the Stars<br \/>\nIn the Red Morning<br \/>\nThe Star Queen&#8217;s Face<br \/>\nRiver Demons<br \/>\nThe Night Runner<\/span><\/div>\n<div><\/div>\n<div><span style=\"font-family: courier new,courier,monospace;\">The Charge of Fire<br \/>\nThe World of Chocolate Songs<br \/>\nA Purloined Cloud<br \/>\nThe Art of Hanging<br \/>\nOde to the Sleepers<br \/>\nThe Gold Inside<br \/>\nEven the Asphalt<br \/>\nRogue Funeral<br \/>\nSea of the Red God<\/span><\/div>\n<div><\/div>\n<p id=\"76ee\">Some of those are not bad! As a friend said, it swerves from the banal to the brilliant. 
There are some awesome ones, like:<\/p>\n<ul>\n<li id=\"d04b\">The Art of Hanging<\/li>\n<li id=\"2079\">Lucifer\u2019s Rainbow<\/li>\n<li id=\"532e\">Sea of the Red God<\/li>\n<li id=\"71bc\">River Demons<\/li>\n<li id=\"f732\">The Invisible Deep<\/li>\n<li id=\"fba0\">Black Song<\/li>\n<li id=\"23af\">The Memories of God<\/li>\n<\/ul>\n<p id=\"41cf\">There is also some comedy gold like \u201cChina Hooker Tomatoes\u201d!<\/p>\n<figure id=\"4c85\" data-scroll=\"native\"><canvas width=\"51\" height=\"75\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*7yQD0XzI0lv4Fl-Fs6vPHA.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*7yQD0XzI0lv4Fl-Fs6vPHA.jpeg\" \/><\/figure>\n<p id=\"a4e9\">You can also see it\u2019s not a great idea to include\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Portmanteau\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Portmanteau\" data->portmanteau<\/a> words like \u201cFreakonomics\u201d as that is clearly someone else\u2019s famous title, so whenever it shows up you\u2019d be breaking copyright if you decided to actually name a book that way. Better to use generic words. Although I quite like \u201cFreakonomics: A Hard Black Dance\u201d. It might make a good followup to\u00a0<a href=\"http:\/\/amzn.to\/2pcs7Rl\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/amzn.to\/2pcs7Rl\" data->the classic original<\/a> by Steven D. Levitt.<\/p>\n<p id=\"e21d\">I had even higher hopes for a sentence level RNN but unfortunately I wasn\u2019t able to find a decent set of working code to test out when I was testing this a few months ago. I\u00a0<a href=\"https:\/\/arxiv.org\/pdf\/1703.07713.pdf\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/arxiv.org\/pdf\/1703.07713.pdf\" data->found several papers out of China<\/a>\u00a0that looked promising. 
Then, of course, in my hiatus, someone went ahead and\u00a0<a href=\"https:\/\/richliao.github.io\/supervised\/classification\/2016\/12\/26\/textclassifier-RNN\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/richliao.github.io\/supervised\/classification\/2016\/12\/26\/textclassifier-RNN\/\" data->did a kickass blog post on sentence level classification with code!<\/a>\u00a0I will likely do a followup after I test some of the new choices out there.<\/p>\n<h3 id=\"46ca\"><strong>NLP and\u00a0Beyond<\/strong><\/h3>\n<p id=\"b475\">That said, I am not sure that what these systems produce is really head and shoulders above random word generators. It\u2019s pretty good, but if you look hard enough you recognize it\u2019s basically a semi-random mashup of already good titles.<\/p>\n<p id=\"c5dc\"><strong>If I\u2019m being honest with you, I have to admit I don\u2019t find these types of systems very effective for cranking out Shakespeare and titles, much less \u201cunreasonably\u201d effective.\u00a0<\/strong>This kind of sentence level generator is mostly a parlor trick that obscures what NLP really does well.<\/p>\n<p id=\"c59b\"><strong>It turns out that NLP is much better at more restricted problem sets, like sentiment classification.<\/strong><\/p>\n<p id=\"26de\">In fact, you can get\u00a0<a href=\"https:\/\/www.youtube.com\/watch?v=nfoudtpBV68\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.youtube.com\/watch?v=nfoudtpBV68\" data->a good breakdown of the state of the art from a lecture from the Stanford NLP intro course<\/a>. 
It\u2019s just slightly out of date in that a few of those problems have been solved better in the last few years but it\u2019s still a great intro to the field.<\/p>\n<p id=\"2d67\"><strong>So what\u2019s the State of the Art?\u00a0<\/strong>Here\u2019s a breakdown from the video:<\/p>\n<p id=\"7959\"><strong>Mostly Solved Tasks:<\/strong><\/p>\n<ul>\n<li id=\"f171\">Spam detection<\/li>\n<li id=\"3ecf\">Part-of-speech tagging (adj\/noun\/verb)<\/li>\n<li id=\"b98a\">Named entity recognition<\/li>\n<\/ul>\n<p id=\"eb7a\"><strong>Making good progress:<\/strong><\/p>\n<ul>\n<li id=\"0596\">Sentiment analysis<\/li>\n<li id=\"7ec8\">Coreference resolution<\/li>\n<li id=\"a516\">Word sense disambiguation<\/li>\n<li id=\"86b3\">Machine translation<\/li>\n<\/ul>\n<p id=\"2161\"><strong>Still really hard:<\/strong><\/p>\n<ul>\n<li id=\"f320\">Question answering<\/li>\n<li id=\"3b3b\">Paraphrasing<\/li>\n<li id=\"b474\">Summarization<\/li>\n<li id=\"54e1\">Dialog<\/li>\n<\/ul>\n<p id=\"4490\">These systems shine when you go\u00a0<em>with<\/em>\u00a0what they\u2019re good at doing, not\u00a0<em>against\u00a0<\/em>it, as I discovered with my title experiment.<\/p>\n<p id=\"c2ba\"><strong>What do all those tasks have in common?<\/strong><\/p>\n<p id=\"e9e4\"><strong>In essence, these systems are good at predicting the next likely word in a previously understood sequence.\u00a0<\/strong>They can also break down a sentence into its component parts or figure out if a sentence is positive or negative.<\/p>\n<p id=\"a993\">What good is that, you wonder?<\/p>\n<p id=\"2a25\">The answer is probably in your pocket. Or you\u2019re staring at the answer if you\u2019re reading this on your phone. 
I\u2019m talking about the Google Assistant or Siri.<\/p>\n<p id=\"af99\">After training these systems on millions of hours of people talking, these AI assistants can take an audio sample and quickly disambiguate a garbled word by predicting that the most likely next word is \u201chelp\u201d instead of \u201chalter.\u201d In fact, I\u2019m finding the new Pixel phone, which is bundled with the latest Google Assistant, to be smashingly good at this kind of task. It rarely predicts the wrong word when I talk to it.<\/p>\n<p id=\"d9c0\">Even better, it seems to understand a lot of the semantic context behind what I\u2019m asking of it. For example, when I say \u201cShow me a bunch of good restaurants nearby\u201d it knows to show\u00a0<em>highly rated<\/em>\u00a0restaurants near me rather than a random selection of crappy eateries. That\u2019s very, very cool.<\/p>\n<p id=\"2163\">It turns out that what I asked my fledgling AI NLP baby to do is a particularly hard problem that just isn\u2019t solved yet. In hindsight, as a writer, it\u2019s not hard for me to figure out why.<\/p>\n<p id=\"57a2\"><strong>While NLP practitioners are focused on decomposing a sentence into its most basic building blocks, a great writer knows that the power and meaning of writing comes from the words working together, not taken in isolation.<\/strong><\/p>\n<p id=\"34dd\">The real patterns I was hoping to detect are much, much different. They\u2019re the stuff of art, such as poetic turns of phrase and unique word combinations. Let\u2019s take a look at a few great titles to see what I mean.<\/p>\n<h3 id=\"5af0\"><strong>The Sound and the Fury of China Hooker\u00a0Tomatoes<\/strong><\/h3>\n<p id=\"683e\">Here\u2019s a famous title from Maya Angelou. 
It\u2019s one of my favorites:<\/p>\n<p id=\"c24c\">1)\u00a0<strong>I Know Why the Caged Bird Sings<\/strong><\/p>\n<p id=\"4e42\">This is an incredibly advanced title construction that highlights why NLP is so challenging.<\/p>\n<p id=\"0212\">First of all, there are very subtle structural problems for machines here. For example, the title rolls off the tongue but there is no clear reason why. It\u2019s not using any obvious literary techniques, like alliteration, that we can easily point out. If we can\u2019t find it, the machine probably can\u2019t either.<\/p>\n<p id=\"8ec4\">Now, it\u2019s arguably using a technique called\u00a0<a href=\"https:\/\/literarydevices.net\/sibilance\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/literarydevices.net\/sibilance\/\" data->sibilance<\/a>, which is either the repetition of S sounds or just speaking with a low level whispering type cadence, though it\u2019s not using it precisely, because there is only one S. Sibilance makes for a sensuous or sinister feeling. Think of a lover whispering in your ear or a snake hissing, both of them using the S to excite or terrorize you.<\/p>\n<p id=\"5976\">Actually if\u00a0<a href=\"https:\/\/www.youtube.com\/watch?v=ePodNjrVSsk\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.youtube.com\/watch?v=ePodNjrVSsk\" data->you\u2019ve ever seen Angelou speak<\/a>\u00a0she uses a great deal of sibilance, so perhaps when I read it, I just hear her voice in my mind? 
And that brings me to the second major problem for machine interpreters:<\/p>\n<p id=\"d9c3\">An NLP system can only understand meaning from what is\u00a0<em>directly contained in the text itself<\/em>.<\/p>\n<p id=\"c22a\">Unfortunately for ML gurus, communication does not exist in a vacuum.<\/p>\n<p id=\"6394\">The real power in this title comes<em>\u00a0not from what is on the page<\/em>\u00a0but\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Philosophy_of_language\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/Philosophy_of_language\" data-><em>what<\/em>\u00a0<em>feelings<\/em> <em>and associations it creates in the reader\u2019s mind<\/em><\/a>.<\/p>\n<p id=\"d7ca\">We bring our own ideas, life experiences and feelings to everything we read. Without that context, a machine can\u2019t figure out the higher order understandings that make this title incredible.<\/p>\n<p id=\"fa68\">For example,\u00a0<em>a bird is born to fly<\/em>. That is its\u00a0<em>primary purpose<\/em>. Yet, the bird is restricted from what it\u2019s\u00a0<em>designed to do<\/em>. It\u2019s stripped of its reason for living and so it sings in its desperation and fever. It sings because it wants to be free, to soar and see the world as a bird is meant to, and so sadness saturates this title. Of course, if\u00a0<a href=\"https:\/\/www.youtube.com\/watch?v=FcwZm5WuKdQ\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.youtube.com\/watch?v=FcwZm5WuKdQ\" data->you know the author\u2019s history and the tragedies in her own life<\/a>, then you can see why she chose it as the title of her autobiography.<\/p>\n<p id=\"4417\">This is something that simply can\u2019t be teased out by using a clustering detection algorithm. 
It comes from your\u00a0<em>associative understanding<\/em>.<\/p>\n<p id=\"dc27\">But all is not lost!<\/p>\n<p id=\"58f8\">Let\u2019s take a look at another great title and see if we can pick up more meaning from only what\u2019s there in front of us.<\/p>\n<p id=\"52d2\">2)\u00a0<strong>Midnight in the Garden of Good and Evil<\/strong><\/p>\n<p id=\"ff6b\">This title is easier for a basic algorithm to work through. It has several obvious poetic techniques, such as alliteration, which is a repetition of consonant sounds like \u201cg.\u201d Since this has actual alliteration, as opposed to only associated alliteration, the system should be able to pick this kind of pattern up.<\/p>\n<p id=\"c1fb\">It also has what I call the \u201cunion of opposites.\u201d You tend to find this kind of dynamic juxtaposition in famous titles like \u201cA Song of\u00a0<strong>Ice<\/strong>\u00a0and\u00a0<strong>Fire\u201d,<\/strong>\u00a0or \u201c<strong>Pretty<\/strong>\u00a0Little\u00a0<strong>Monsters\u201d,\u00a0<\/strong>or even historical events like<strong>\u00a0<\/strong>\u201c<strong>War\u00a0<\/strong>of the\u00a0<strong>Roses<\/strong>\u201d. Flowers and destruction are not precise opposites but one could easily be considered to have a positive sentiment (roses) and the other negative (war). Some great titles are built on this principle alone, like\u00a0<strong>War<\/strong>\u00a0and\u00a0<strong>Peace<\/strong>.<\/p>\n<p id=\"b45f\">It also uses evocative and sentimental words like \u201cmidnight\u201d and \u201cgarden\u201d. These words create picturesque images in the reader\u2019s mind, both frightening and beautiful. 
A system could easily be designed to understand these emotionally charged words, because marketers have been picking out \u201cpower words\u201d for a hundred years.<\/p>\n<h3 id=\"93a0\">When Doves\u00a0Cry<\/h3>\n<p id=\"5d2f\">Ambiguity is very hard for NLP systems to deal with and yet\u00a0<em>it\u2019s at the very heart of what makes for great writing, in particular fiction, literature, film and poetry!<\/em><\/p>\n<p id=\"bb00\">It\u2019s one thing to grasp the deep structure of how a basic sentence is constructed. If you were unlucky enough to live through sentence diagramming in grade school you learned how to slice up a sentence into its component parts. But while this might be interesting to teachers, editors and math peeps, you might be surprised to find that to a writer it\u2019s plain old torture.<\/p>\n<p id=\"ebb0\">I hated sentence diagramming!<\/p>\n<p id=\"9f11\">That\u2019s because my fellow authors and I understand that the true power of words comes from somewhere else. It\u2019s one thing to detect parts of speech. It\u2019s completely different to detect what makes a phrase that sets a person\u2019s heart on fire.<\/p>\n<p id=\"887f\">Sentence diagramming does not a writer make.<\/p>\n<p id=\"4756\">Even that sentence is not something a machine could comprehend. It\u2019s basically bad grammar. And yet using it forces you to stop and notice it. You have to pause for a split second to process it, even if that happens at an unconscious level. If I did that at a key moment in the plot of a great novel that I wanted you to pay close attention to, you might stand more of a chance of picking up on it as a reader.<\/p>\n<h3 id=\"bb37\"><strong>And That\u2019s\u00a0That<\/strong><\/h3>\n<p id=\"d0eb\">OK, so maybe we didn\u2019t get the next great title for book two of\u00a0<a href=\"http:\/\/amzn.to\/2oxO5O1\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/amzn.to\/2oxO5O1\" data-><strong>The Jasmine Wars<\/strong><\/a>. 
Eventually I created my own title,\u00a0<a href=\"http:\/\/amzn.to\/2nD8VfE\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/amzn.to\/2nD8VfE\" data-><strong>Through the Darkening Sky<\/strong><\/a>, based on a line from a Chinese sci-fi story in translation called\u00a0<a href=\"http:\/\/amzn.to\/2oy102b\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/amzn.to\/2oy102b\" data->Invisible Planets<\/a>. I wanted to evoke the image of a coming storm that you can\u2019t escape. You can only hold on and go through it. By the way, the new cover is below. I love the artwork, done by the amazing artist\u00a0<a href=\"https:\/\/www.artstation.com\/artist\/neisbeis\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.artstation.com\/artist\/neisbeis\" data->Ignatio Bazan Lazcano<\/a>.<\/p>\n<figure id=\"2ea3\" data-scroll=\"native\"><canvas width=\"53\" height=\"75\"><\/canvas><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*Vmiafmfd76mKaHWfsPQpVg.jpeg\" data-src=\"https:\/\/cdn-images-1.medium.com\/max\/540\/1*Vmiafmfd76mKaHWfsPQpVg.jpeg\" \/><\/figure>\n<p id=\"429d\">But don\u2019t let my failed automagical title generator experiment hold you back from diving into NLP!<\/p>\n<p id=\"2bce\">The field is currently enjoying billions of dollars in research and a true renaissance as it powers more and more essential apps like Google Translate, digital assistants and\u00a0<a href=\"http:\/\/www.sciencealert.com\/these-new-earbuds-can-translate-languages-for-you-in-real-time\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/www.sciencealert.com\/these-new-earbuds-can-translate-languages-for-you-in-real-time\" data-><strong>even<\/strong>\u00a0<strong>real-time translation engines that live inside ear buds<\/strong><\/a>. 
It\u2019s critical that we teach machines to understand us better.<\/p>\n<p id=\"79bd\">If you want to learn more, then head on over to\u00a0<a href=\"http:\/\/cs224d.stanford.edu\/syllabus.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/cs224d.stanford.edu\/syllabus.html\" data->the Stanford course on NLP<\/a>. Or just keep hammering through some of the blogs linked in this article. I won\u2019t lie: It\u2019s not an easy subject. As we saw, language and math are often at odds. They seem to exist in different parts of the brain, which is why people often do well on only one part of the SATs, either math or English.<\/p>\n<p id=\"bcfe\">And yet, there is a strange unity to language and math. They\u2019re woven together in unexpected ways, like\u00a0<a href=\"http:\/\/grammar.yourdictionary.com\/style-and-usage\/rules-for-writing-haiku.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/grammar.yourdictionary.com\/style-and-usage\/rules-for-writing-haiku.html\" data->the 5\u20137\u20135 rule of haiku<\/a>. I can\u2019t help but think that tomorrow\u2019s systems might discover all kinds of hidden patterns as they work their way through the great art and literature of the past. Perhaps there\u2019s a hidden pattern beneath the sea of words that goes so deep that even a writer can\u2019t sense it, except in his dreams.<\/p>\n<p id=\"32f7\">And maybe, just maybe, there\u2019s an AI, waiting to be born, that will one day sing the songs that make the whole world sing.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Ready to learn Artificial Intelligence? Browse courses\u00a0like\u00a0 Uncertain Knowledge and Reasoning in Artificial Intelligence developed by industry thought leaders and Experfy in Harvard Innovation Lab. After discovering\u00a0the amazing power of convolutional neural networks for image recognition\u00a0in part five of this series, I decided to dive head first into\u00a0Natural language Processing or NLP. 
(If you missed<\/p>\n","protected":false},"author":393,"featured_media":3391,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[183],"tags":[97],"ppma_author":[2209],"class_list":["post-1421","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-ml","tag-artificial-intelligence"],"authors":[{"term_id":2209,"user_id":393,"is_guest":0,"slug":"daniel-jeffries","display_name":"Daniel Jeffries","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","author_category":"","user_url":"","last_name":"Jeffries","first_name":"Daniel","job_title":"","description":"Dan Jeffries is an author, engineer and serial entrepreneur. During his two decades in the computer industry, he&#039;s covered a broad range of tech from Linux to networks and virtualization.&nbsp;"}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1421","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/393"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1421"}],"version-history":[{"count":0,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1421\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/3391"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1421"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1421"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1421"},{"taxonomy":"author","embeddab
le":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1421"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}