Catch me if you can: A simple English explanation of GANs, or duelling neural nets


Deep learning GANs are one of the biggest breakthrough technologies of 2018, as per MIT Technology Review’s annual list of the top 10 technologies.

“Practice makes perfect”

I’m not so sure about humans, but anyone working in machine learning will agree that practice, in the form of quality training data, makes machines perfect. Well, almost. They certainly gain far more from it than us mortals.

A perfect AI implementation in any field is indistinguishable from magic. The trouble is that as machines start learning, their hunger for data is insatiable, like Tantalus from Greek myth, whose thirst and hunger could never be satisfied.

A data scientist’s days are spent acquiring (and cleaning) more and more data to feed machines. And their nights are lost teaching machines to learn from all this data, by training models over and over again.

Severe shortcomings in both ‘data’ and ‘training’ take the A & I out of mAgIc, making it meaningless. This is the biggest bottleneck for AI’s progress today.

But wait. If machines could take up any human task, why not this one too? Can we make machines learn to teach themselves? No, this is not a play on words. And yes, this is doable.

(Pic source: stavos on Flickr)

Duelling neural networks

Enter GANs, or to use the complex-sounding expansion, Generative Adversarial Networks. If deep learning is the next big thing that’s taking the cake, GANs are the cream on that cake. The possibilities have never looked so exciting!

But first, what is a GAN? We’ll try to have the rest of this conversation in simple English, without tossing in geeky jargon. So, a strict NO to stuff like ‘probabilities’, ‘perceptrons’, ‘activation’, ‘convolution’ and other gobbledegook.

Let me tell you a story.

Setting off the perfect cat-and-mouse game

Imagine a quintessential movie where two estranged brothers embrace opposing philosophies in life. One starts a fresh underworld operation printing fake currency, as a ‘manipulator’. The other enrols in a bureau to set up a new division that detects counterfeits, as an ‘enforcer’.

To start with, let’s say that the ‘manipulator’ in the underworld begins with the disadvantage of knowing nothing about what genuine currency notes look like. The ‘enforcer’ in the bureau knows just the basics of how a few real notes look.

And then the game begins.

The manipulator starts printing, but the initial fakes are terrible. It doesn’t even take a trained eye to spot the counterfeits, and the enforcer promptly detects every single one of them.

The manipulator is industrious and keeps churning out fakes, while also learning what didn’t work in previous attempts. Through the sheer volume of experimentation and some feedback, the quality of the counterfeits slowly starts inching up (assuming, of course, that the operation is not shut down!).

Eventually, the manipulator starts getting a few random counterfeits right, and these go undetected by the enforcer. So it’s learning time on the other side, and the enforcer takes lessons on detecting these smarter counterfeits.

With the enforcer getting smarter, the counterfeits are detected again. The manipulator has no choice but to upgrade the counterfeiting operation to create more genuine-looking fakes.

This game of cat-and-mouse continues, and ends up making experts out of both the manipulator and the enforcer. So much so that the counterfeits become indistinguishable from the genuine notes, and the enforcer’s ability to detect such ingenious fakes becomes almost uncanny.

You get the drift. And this is the underlying concept of GANs.

Photos ©Dreamworks

Generative Adversarial Network, in context

Let’s now translate our story and its actors into the context of GANs.

GANs — a schematic flow with the key players

Both the manipulator and the enforcer are models, each one a deep learning neural network.

The manipulator is called the ‘Generator network’. Its job is to create data, starting from random output and getting as realistic as it can. The enforcer is the ‘Discriminator network’. It is shown both genuine examples and the generator’s output, and its job is to classify each one as ‘real’ or ‘fake’, and to become pretty good at it.

By pairing two models against each other as adversaries, we set them up for a healthy competition. Each tries mastering its own job across thousands of iterations, with no manual intervention. And voila, we end up with true-looking fakes and also a model that can detect most con-jobs.
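For readers who do want a peek under the hood, here is a minimal sketch of that contest in plain Python. It is an illustration under stated assumptions, not how production GANs are built. The ‘currency’ is just a number scattered around 2.0, the generator is a single adjustable offset `theta`, and the discriminator is a tiny one-input classifier with two knobs, `w` and `b`. The function name `train_toy_gan` and every setting in it (step count, batch size, learning rate) are hypothetical choices made for this example. Real GANs use deep networks on both sides.

```python
import math
import random


def sigmoid(t):
    # Clamp the input so math.exp never overflows.
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def train_toy_gan(steps=4000, batch=32, lr=0.05, real_mean=2.0, seed=0):
    """Toy 1-D GAN. All names and settings are illustrative assumptions."""
    rng = random.Random(seed)
    theta = 0.0      # generator (manipulator): a fake is theta + noise
    w, b = 0.0, 0.0  # discriminator (enforcer): D(x) = sigmoid(w * x + b)

    for _ in range(steps):
        reals = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        fakes = [theta + rng.gauss(0.0, 1.0) for _ in range(batch)]

        # Enforcer's lesson: raise D on real samples, lower it on fakes
        # (gradient ascent on log D(real) + log(1 - D(fake))).
        grad_w = grad_b = 0.0
        for x in reals:
            p = sigmoid(w * x + b)
            grad_w += (1.0 - p) * x
            grad_b += 1.0 - p
        for x in fakes:
            p = sigmoid(w * x + b)
            grad_w -= p * x
            grad_b -= p
        w += lr * grad_w / batch
        b += lr * grad_b / batch

        # Manipulator's lesson: shift theta so fresh fakes score higher
        # with the enforcer (gradient ascent on log D(fake)).
        fakes = [theta + rng.gauss(0.0, 1.0) for _ in range(batch)]
        grad_theta = sum((1.0 - sigmoid(w * x + b)) * w for x in fakes) / batch
        theta += lr * grad_theta

    return theta, w, b
```

Calling `train_toy_gan()` returns the final `theta`, `w` and `b`. Under these assumptions the generator’s offset should drift from its starting value of 0 toward the real mean, without ever being shown a genuine sample directly. Its only feedback is the discriminator’s verdict, which is the point of the story above.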

This is why GANs are such a master stroke in AI. They address two real-world problems at once: generating ‘data’ when you don’t have enough to start with, and ‘training’ models without manually labelled examples, which makes them a form of unsupervised learning.

At least that’s where they are headed, and they are already operational. Over the past couple of years, there has been steady advancement in GANs, with hundreds of variants created and many more innovations underway.

Generative Adversarial Networks is the most interesting idea in the last ten years in machine learning. — Yann LeCun, Director, Facebook AI

What’s the utility of GANs?

What worldly good might come out of a perfect currency-printing machine or something conceptually similar? Apparently plenty. Let’s look at three broad areas.

1. Creative pursuits

It’s incredible to imagine that machines have finally unlocked their right brains. After all, who wouldn’t be surprised when a nerdy programmer suddenly starts penning award-winning poetry?

With a new-found approach to mimicking real images, GANs have started creating imaginary celebrities, or new masterpieces that bear the distinctive signature of an artist. The potential use cases for this ability span creative disciplines.

Imaginary celebrities: images generated by Nvidia’s GAN model, using a celebrity faces dataset as reference. (Paper)

2. Translating text to images

Suppose you want to find out how a person would look without their glasses or with a new hairdo. You just ask to have it created. That is not very different from asking for the day’s weather or mapping your upcoming commute.

By creating new flora and fauna from a user’s short text description, GANs dutifully grant such demands, just like a wish-fulfilling genie. Pity they can’t breathe life into their creations. At least, not yet.

Text to Image synthesis (Paper: https://arxiv.org/abs/1612.03242)

3. Generating training data

GANs can do the heavy lifting of creating tons of training data, which can shift AI into the fast lane of progress. Imagine GANs spawning realistic 3D worlds similar to ours, with millions of miles of roads and all possible traffic scenarios.

Rather than a self-driving car or drone being trained in the real world and causing horrendous accidents, it could be trained in these virtual worlds and become an expert driver. With GPU computing, such training can cover ground far faster than any real-world test drive.

While these applications show the direction of travel, GANs have already been applied to high-impact business problems like drug discovery, and there are hundreds of use cases in early stages of experimentation.

While this might already sound revolutionary, the best of GANs is yet to come. The purpose of this article is to share a simple, inclusive tutorial to spread awareness of this important technology. Now, let’s wait for the magic to unfold!