Why GANs are so exciting

Jay Parthasarthy
4 min read · Jun 19, 2018

Machine learning, as a technology that is now past its inflection point, continues to see the rate of innovation accelerate. And even within a field that changes radically over a matter of months, generative networks are one of the fastest-moving areas.

Yann LeCun famously said that generative networks are “perhaps the most exciting development in machine learning in the past 20 years.” He said this from a research perspective, but in the applied machine learning world, GANs are also racing towards deployment: we’ll see some really interesting and revolutionary tech built on them in the next few years.

Let’s take a look at some of the reasons why GANs are so interesting.

In a broad sense, GANs work by training two parallel models:

  • The generator, which tries to create realistic-looking outputs. Its job is to fool:
  • The discriminator, which is trained to differentiate between fake and real outputs.
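
To make the two roles concrete, here’s a minimal sketch of what the pair might look like in PyTorch. The layer sizes, the 64-dimensional noise vector, and the 28×28 image shape are arbitrary illustrative choices, not a prescribed architecture.

```python
import torch
import torch.nn as nn

LATENT_DIM = 64    # size of the random noise vector fed to the generator (arbitrary)
IMG_DIM = 28 * 28  # flattened image size, e.g. 28x28 grayscale images

# The generator maps random noise to a fake image.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256),
    nn.ReLU(),
    nn.Linear(256, IMG_DIM),
    nn.Tanh(),  # outputs in [-1, 1], matching real images normalized to that range
)

# The discriminator maps an image to a single "probability that it's real".
discriminator = nn.Sequential(
    nn.Linear(IMG_DIM, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
    nn.Sigmoid(),
)
```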

There’s a constant battle between the generator and the discriminator going on when GANs are being trained. They’re locked in a figurative “arms race” to see which one can outdo the other.

Training a GAN continues until either both networks are only getting incrementally better at their given tasks, or the generator becomes so good at what it’s doing that the discriminator can no longer reliably tell its outputs apart from the real data.
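
One way to picture that arms race is as a training loop that alternates between the two networks. The sketch below continues from the models above and assumes a `real_batches` iterable of normalized image tensors; it’s illustrative rather than a tuned recipe.

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)

for real in real_batches:  # each `real` is a (batch_size, IMG_DIM) tensor (assumed)
    batch_size = real.size(0)
    noise = torch.randn(batch_size, LATENT_DIM)
    fake = generator(noise)

    # Discriminator step: learn to label real images 1 and generated images 0.
    d_loss = bce(discriminator(real), torch.ones(batch_size, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(batch_size, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator call the fakes "real".
    g_loss = bce(discriminator(fake), torch.ones(batch_size, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```

In practice, the interesting signal is how the two losses move relative to each other: if the discriminator’s loss collapses to zero, the generator has stopped getting useful feedback.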

There’s another way to look at what’s occurring in a GAN besides an arms race: it’s kind of like a teacher-student dynamic.

The discriminator can be thought of as the “teacher”, telling the student (the generator) where to improve. We all know that it’s much easier to improve when you have an expert correcting your low-level mistakes, and GANs attempt to bring that same idea into machine learning.

In this way, GANs end up being a lot more flexible than traditional loss-based training in machine learning. When you were learning how to write an essay, you probably didn’t try to figure out what characters would make up the “optimal essay” and then work backwards. Instead, you wrote a draft, got help, revised it, and repeated until you had something that was good.

GANs try to learn in a very different way than traditional ML algorithms, which can be a positive or a negative depending on the problem. In my opinion, it’s a much more human-motivated method than typical loss minimization, and that may lead to advances in how we train models.
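
To make that contrast concrete: a supervised model minimizes a single loss against fixed targets, while the original GAN formulation (Goodfellow et al., 2014) sets up a two-player minimax game between the generator G and the discriminator D:

min_G max_D V(D, G) = E_{x ~ p_data}[log D(x)] + E_{z ~ p_z}[log(1 - D(G(z)))]

The discriminator pushes V up by classifying real and fake samples correctly; the generator pushes it down by producing samples the discriminator scores as real.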

GANs also have a kind of “creativity”: they tend to learn the general tendencies of the data they’re trained on, so they can extrapolate based on these learned features. Supervised learning approaches are notoriously bad at extrapolation. Because the generator is asking “how can I best fool this discriminator?”, it doesn’t have to copy the sample data directly. It can come up with novel outputs that merely imitate the sample data, and those outputs can fall outside the original training examples.

None of these celebrities actually exist.

In this way, the generator acts like an encoding, or summary, of the input data it was trained on. A generator has a limited number of parameters, and the model is almost certainly much smaller than the training data, so the GAN has to summarize the general tendencies of the input data.

GAN for ImageNet Dataset.

Above, we can see a GAN trying to generate images based on the ImageNet dataset. In essence, what it has to do is compress 200GB of data into 50MB of weights, so it has to learn the most salient features of the data. For example, it has to capture what generally makes a dog look like a dog rather than memorizing individual training images.

This concept of a GAN acting as an embedding is driven home by the idea of feature arithmetic. Representation learning has always been at the core of deep learning, and good embeddings are a core goal of the space. Word2Vec, and good word encodings in general, are a big part of the reason that neural machine translation (NMT) works as well as it does today. A prominent feature of Word2Vec’s encodings was the ability to do arithmetic between words in vector space: ‘Girl’ - ‘Boy’ + ‘Nephew’ ≈ ‘Niece’. These kinds of analogies are part of what lets models generalize in the problem space.
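
As a toy illustration of that arithmetic (using made-up 3-dimensional vectors as stand-ins for real Word2Vec embeddings, which are learned and typically have a few hundred dimensions):

```python
import numpy as np

# Hypothetical tiny embeddings, just to show the mechanics.
vecs = {
    "girl":   np.array([0.9, 0.1, 0.3]),
    "boy":    np.array([0.1, 0.9, 0.3]),
    "nephew": np.array([0.1, 0.9, 0.8]),
    "niece":  np.array([0.9, 0.1, 0.8]),
}

# 'girl' - 'boy' + 'nephew' should land near 'niece'.
query = vecs["girl"] - vecs["boy"] + vecs["nephew"]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(max(vecs, key=lambda w: cosine(vecs[w], query)))  # -> "niece"
```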

It turns out that you can actually perform similar types of arithmetic with GANs! It’s called feature arithmetic. For example, if you take ‘smiling woman’ - ‘neutral woman’ + ‘neutral man’, you’ll get ‘smiling man’.
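
In practice (as in the DCGAN paper), the arithmetic happens in the generator’s latent space: you average the noise vectors behind several “smiling woman” samples, do the same for the other two groups, combine the averages, and decode the result. A rough sketch, reusing the `generator` and `LATENT_DIM` from the earlier code, with random stand-ins for the latent vectors you’d actually collect by inspecting generated samples:

```python
import torch

# Stand-ins for latent vectors whose generated images a human sorted into groups.
z_smiling_woman = torch.randn(16, LATENT_DIM)
z_neutral_woman = torch.randn(16, LATENT_DIM)
z_neutral_man   = torch.randn(16, LATENT_DIM)

# Average each group into a stable "concept" vector, then do the arithmetic.
z = z_smiling_woman.mean(0) - z_neutral_woman.mean(0) + z_neutral_man.mean(0)

# Decode the result: with real collected vectors, ideally an image of a smiling man.
smiling_man = generator(z.unsqueeze(0))
```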

Why are these embeddings so important in machine learning? Well, if machines aren’t able to relate features to one another, what they learn about one specific feature won’t generalize, and the model will converge much more slowly. If we can use GANs to understand which features are truly salient for a given dataset, it will almost certainly lead us towards better embeddings and, in turn, better models.

These are just some of the research directions that look promising for GANs, and we’ll soon see really interesting applications of them, too.

Personally, I’m really hopeful for GANs in a wide variety of situations.
