Dougal J. Sutherland

Gatsby, UCL→TTIC→UBC

(from `thispersondoesnotexist.com`)


- Start with a bunch of examples: $X_1, \dots, X_n \sim \mathbb{P}$
- Want a model for the data: $\mathbb{Q}_\theta \approx \mathbb{P}$
- Might want to do different things with the model:
  - Find most representative data points / modes
  - Find outliers, anomalies, …
  - Discover underlying structure of the data
  - Impute missing values
  - Use as prior (semi-supervised, machine translation, …)
  - Produce “more samples”
  - …

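- Maximum likelihood: $\max_\theta \mathbb{E}_{X \sim \mathbb{P}}[\log q_\theta(X)]$, with $q_\theta$ the density of the model $\mathbb{Q}_\theta$
- Equivalent: $\min_\theta \mathrm{KL}(\mathbb{P} \,\|\, \mathbb{Q}_\theta)$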
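
Why the two are equivalent, as a short derivation. The notation is an assumption, since the slide's formulas did not survive extraction: $\mathbb{P}$ is the data distribution with density $p$, and $\mathbb{Q}_\theta$ is the model with density $q_\theta$.

```latex
\begin{align*}
\mathrm{KL}(\mathbb{P} \,\|\, \mathbb{Q}_\theta)
  &= \mathbb{E}_{X \sim \mathbb{P}}\!\left[\log \frac{p(X)}{q_\theta(X)}\right] \\
  &= \underbrace{\mathbb{E}_{X \sim \mathbb{P}}[\log p(X)]}_{\text{does not depend on } \theta}
     \;-\; \mathbb{E}_{X \sim \mathbb{P}}[\log q_\theta(X)]
\end{align*}
% Hence \arg\min_\theta \mathrm{KL}(\mathbb{P} \| \mathbb{Q}_\theta)
%     = \arg\max_\theta \mathbb{E}_{X \sim \mathbb{P}}[\log q_\theta(X)].
```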

- 1987-style generative model of faces (Eigenface via Alex Egg)
- Can do fancier versions, of course…
- Usually based on Gaussian noise, i.e. $L_2$ loss

- One use case of generative models is inpainting [Harry Yang]:
- $L_2$ loss / Gaussians will pick the *mean* of possibilities
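
A tiny numerical illustration of that point, under made-up assumptions: a single missing pixel has two equally plausible completions (dark or bright), and we look for the one prediction that minimizes squared error against both. The setup and names are hypothetical, not from the talk.

```python
# Two equally plausible completions of one missing pixel: dark (0.0) or bright (1.0).
possibilities = [0.0] * 50 + [1.0] * 50

def l2_loss(prediction, targets):
    """Mean squared error of one fixed prediction against all plausible targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)

# Scan candidate predictions on a grid and keep the one with the smallest loss.
grid = [i / 100 for i in range(101)]
best = min(grid, key=lambda c: l2_loss(c, possibilities))
mean = sum(possibilities) / len(possibilities)

print(best, mean)  # the L2-optimal prediction coincides with the mean
```

The best prediction is a mid-gray value that matches neither plausible completion, which is the blurriness seen in squared-error inpainting.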

Generator ($\mathbb{Q}_\theta$)

Discriminator

Target ($\mathbb{P}$)

Is this real?

No way!

:( I'll try harder…

⋮

Is this real?

Umm…

- MLCC so far: models ,
- The nonlinearity is an *activation function*
- Classification usually uses log loss (cross-entropy)
- Optimize with gradient descent
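
A minimal sketch of "log loss + gradient descent" on the simplest possible classifier, a 1-D logistic regression. Everything here (the data, `train`, the learning rate) is a hypothetical stand-in; a deep net only changes how the logit is computed.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def log_loss(w, b, data):
    """Average cross-entropy of the classifier sigmoid(w * x + b) on (x, y) pairs."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

def train(data, lr=0.5, steps=200):
    """Plain full-batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y  # derivative of the log loss w.r.t. the logit
            gw += err * x
            gb += err
        w -= lr * gw / len(data)
        b -= lr * gb / len(data)
    return w, b

rng = random.Random(0)
data = [(rng.gauss(-1.0, 1.0), 0) for _ in range(100)] + \
       [(rng.gauss(+1.0, 1.0), 1) for _ in range(100)]

w, b = train(data)
print(log_loss(0.0, 0.0, data), log_loss(w, b, data))
```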

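- How to specify $\mathbb{Q}_\theta$?
- Sample latent noise $Z$ from a simple fixed distribution, then output $G_\theta(Z) \sim \mathbb{Q}_\theta$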

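- Tricking the discriminator: $\min_\theta \max_\psi \; \tfrac12 \mathbb{E}_{X \sim \mathbb{P}}[\log D_\psi(X)] + \tfrac12 \mathbb{E}_{Y \sim \mathbb{Q}_\theta}[\log(1 - D_\psi(Y))]$
- Using the generator network for $\mathbb{Q}_\theta$: $\min_\theta \max_\psi \; \tfrac12 \mathbb{E}_{X \sim \mathbb{P}}[\log D_\psi(X)] + \tfrac12 \mathbb{E}_{Z}[\log(1 - D_\psi(G_\theta(Z)))]$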
- Can do alternating gradient descent!
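
A toy sketch of alternating gradient descent, with hand-written gradients so nothing is hidden. All of it is an assumption made for illustration: 1-D data from $\mathcal{N}(3, 1)$, a generator $G(z) = z + b$ that only learns a shift, a logistic discriminator $D(x) = \mathrm{sigmoid}(wx + c)$, and the minimax value function $\tfrac12 \mathbb{E}\log D(X) + \tfrac12 \mathbb{E}\log(1 - D(G(Z)))$. It is not the setup of any paper cited in the talk.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_toy_gan(steps=2000, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    b = 0.0          # generator parameter: G(z) = z + b
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)
    history = []
    for _ in range(steps):
        real = [rng.gauss(3.0, 1.0) for _ in range(batch)]
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(batch)]

        # Discriminator step: gradient ASCENT on the value function.
        gw = gc = 0.0
        for x in real:                    # d/d(logit) of log D(x) is 1 - D(x)
            d = sigmoid(w * x + c)
            gw += (1.0 - d) * x
            gc += (1.0 - d)
        for y in fake:                    # d/d(logit) of log(1 - D(y)) is -D(y)
            d = sigmoid(w * y + c)
            gw -= d * y
            gc -= d
        w += lr * gw / (2 * batch)
        c += lr * gc / (2 * batch)

        # Generator step: gradient DESCENT on the same value function.
        # d/db of log(1 - D(z + b)) is -D(z + b) * w.
        gb = sum(-sigmoid(w * y + c) * w for y in fake)
        b -= lr * gb / (2 * batch)
        history.append(b)
    return b, w, c, history

b, w, c, history = train_toy_gan()
print("generator shift b =", b, "(target mean is 3.0)")
```

The two players take turns on freshly sampled minibatches; there is no guarantee of convergence in general, so the printed value is worth inspecting rather than assuming.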

Running code from [Salimans+ NeurIPS-16]:

(Panels: Run 1 at epochs 1, 2, 3, 4, 5, 6, 11, 501, 900; Run 2 at epochs 1, 2, 3, 4, 5.)

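- What happens when $D$ is at its optimum?
- If distributions have densities, $D^*(x) = \frac{p(x)}{p(x) + q_\theta(x)}$
- If $D$ stays optimal throughout, $\theta$ tries to minimize $\tfrac12 \mathbb{E}_{X \sim \mathbb{P}}[\log D^*(X)] + \tfrac12 \mathbb{E}_{Y \sim \mathbb{Q}_\theta}[\log(1 - D^*(Y))]$, which is $\mathrm{JS}(\mathbb{P}, \mathbb{Q}_\theta) - \log 2$

- If $\mathbb{P}$ and $\mathbb{Q}_\theta$ have (almost) disjoint support, a perfect discriminator exists, so $\mathrm{JS}(\mathbb{P}, \mathbb{Q}_\theta) = \log 2$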
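
A quick check of the disjoint-support case on discrete distributions (a hypothetical four-bin example): the Jensen-Shannon divergence is $\log 2$ for any pair with disjoint support, no matter how "close" the supports are.

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions given as lists; 0 * log 0 is taken as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence: average KL to the mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p       = [1.0, 0.0, 0.0, 0.0]
q_near  = [0.0, 1.0, 0.0, 0.0]   # disjoint from p, mass in the neighbouring bin
q_far   = [0.0, 0.0, 0.0, 1.0]   # disjoint from p, mass far away
q_close = [0.9, 0.1, 0.0, 0.0]   # overlapping support

print(js(p, q_near), js(p, q_far), math.log(2))
print(js(p, q_close))
```

Both disjoint cases give the same value, so the divergence carries no information about which generator is nearer the target.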

Generator ($\mathbb{Q}_\theta$)

Discriminator

Target ($\mathbb{P}$)

Is this real?

No way!

:( I don't know how to do any better…

- At initialization, pretty reasonable:
- Remember we might have
- For usual $G_\theta$, $\mathbb{Q}_\theta$ is supported on a countable union of manifolds with dim at most that of the latent space
- “Natural image manifold” usually considered low-dim
- No chance that they'd align at init, so $\mathrm{JS}(\mathbb{P}, \mathbb{Q}_\theta) = \log 2$

- Original GANs almost never use the minimax game
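- If $D$ is near-perfect, $\log D(G(Z))$ is near $-\infty$ instead of $\log(1 - D(G(Z)))$ being near $0$, so the generator maximizes $\log D(G(Z))$ and keeps a useful gradient
- When $D$ is near-perfect, this makes training unstable instead of stuck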
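
A small sketch of why the choice of generator loss matters, assuming the alternative is the "non-saturating" loss $-\log D(G(z))$ proposed in the original GAN paper. Write $a$ for the discriminator's logit on a generated sample, so $D = \mathrm{sigmoid}(a)$; the function names are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_minimax(a):
    """d/da of log(1 - sigmoid(a)): the minimax generator loss as a function of the logit."""
    return -sigmoid(a)

def grad_non_saturating(a):
    """d/da of -log(sigmoid(a)): the alternative (non-saturating) generator loss."""
    return -(1.0 - sigmoid(a))

# A near-perfect discriminator assigns very negative logits to generated samples.
for a in [0.0, -5.0, -10.0]:
    print(a, grad_minimax(a), grad_non_saturating(a))
```

With a confident discriminator the minimax gradient shrinks towards zero (stuck), while the alternative stays near magnitude one.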

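$\mathcal{W}(\mathbb{P}, \mathbb{Q}) = \sup_{f : \|f\|_L \le 1} \mathbb{E}_{X \sim \mathbb{P}}[f(X)] - \mathbb{E}_{Y \sim \mathbb{Q}}[f(Y)]$, where $f$ is a $1$-Lipschitz *critic function*

Turns out $\mathcal{W}$ is *continuous*: if $\mathbb{Q}_n \to \mathbb{P}$ in distribution, then $\mathcal{W}(\mathbb{Q}_n, \mathbb{P}) \to 0$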
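
A sketch of what continuity buys, assuming the distance on this slide is the Wasserstein-1 distance (the surviving text only mentions a Lipschitz critic). For two 1-D samples of equal size, Wasserstein-1 reduces to matching sorted points; `wasserstein_1d` is a hypothetical helper name.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 between two equal-size 1-D samples: pair sorted points, average the gaps."""
    assert len(xs) == len(ys)
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

target = [0.0] * 5                      # all mass at 0
for shift in [1.0, 0.1, 0.01]:
    model = [shift] * 5                 # all mass at `shift`: disjoint support from target
    print(shift, wasserstein_1d(model, target))
```

The distance shrinks smoothly as the model approaches the target, even though the supports never overlap; the Jensen-Shannon divergence would stay at $\log 2$ for every one of these shifts.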

- Idea: turn discriminator into a critic
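- Need to enforce $\|f_\psi\|_L \le 1$
- Easy ways to do this are way too stringent
- Instead, control $\|\nabla_x f_\psi(x)\|$ *on average, near the data*
- Specifically: $\tilde{X} = \gamma X + (1 - \gamma) Y$ with $\gamma \sim \mathrm{Uniform}(0, 1)$, and add the penalty $\lambda \, \mathbb{E}\big[(\|\nabla_{\tilde{X}} f_\psi(\tilde{X})\| - 1)^2\big]$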
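
A minimal NumPy sketch of such a gradient penalty, assuming the WGAN-GP form: evaluate the critic's input gradient at random interpolates between real and generated points, and penalize its norm for deviating from 1. The tiny critic $f(x) = v^\top \tanh(Wx)$ and all names are hypothetical, chosen so the gradient can be written by hand; a real implementation would use automatic differentiation.

```python
import numpy as np

def critic(x, W, v):
    """Tiny critic f(x) = v . tanh(W x), applied to each row of x (shape (n, d))."""
    return np.tanh(x @ W.T) @ v

def critic_grad(x, W, v):
    """Analytic gradient of the critic w.r.t. each input row; result has shape (n, d)."""
    h = np.tanh(x @ W.T)              # hidden activations, shape (n, k)
    return ((1.0 - h ** 2) * v) @ W   # chain rule through tanh

def gradient_penalty(real, fake, W, v, rng):
    """Mean of (||grad f(x_mix)|| - 1)^2 over random interpolates of real and fake rows."""
    gamma = rng.uniform(size=(real.shape[0], 1))
    mix = gamma * real + (1.0 - gamma) * fake
    norms = np.linalg.norm(critic_grad(mix, W, v), axis=1)
    return float(np.mean((norms - 1.0) ** 2))

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
v = rng.normal(size=4)
real = rng.normal(size=(8, 3)) + 2.0
fake = rng.normal(size=(8, 3))

gp = gradient_penalty(real, fake, W, v, rng)
print(gp)
```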

- Make the problem harder so there's no perfect discriminator
- Use $X + \varepsilon$, $Y + \varepsilon'$ for some independent, full-dim noise $\varepsilon, \varepsilon'$
- But…how much noise to add? Also need more samples.
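- If the noise is Gaussian with variance $\gamma$ and we take $\gamma \to 0$, get a penalty on the discriminator's gradient norm
- Same kind of gradient penalty!
- Can also simplify to e.g. $\mathbb{E}_{X \sim \mathbb{P}} \|\nabla_x f(X)\|^2$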

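- Regular deep nets: $f(x) = \sigma_L(W_L \, \sigma_{L-1}(W_{L-1} \cdots \sigma_1(W_1 x)))$
- Spectral normalization: replace each $W_\ell$ with $W_\ell / \|W_\ell\|_2$
- $\|W\|_2$ is the *spectral norm* (largest singular value)
- Guarantees $\|f\|_L \le 1$ (for 1-Lipschitz activations)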
- Faster to evaluate than gradient penalties
- Not as well understood yet
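
A sketch of the core computation, assuming the spectral norm is estimated by power iteration. `spectral_norm` here is a hypothetical standalone helper run to convergence on a fixed matrix; practical implementations typically do a single iteration per training step and reuse the vector between steps.

```python
import numpy as np

def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)   # approximate top right-singular vector
        u = W @ v
        u /= np.linalg.norm(u)   # approximate top left-singular vector
    return float(u @ W @ v)

W = np.array([[4.0, 0.0],
              [3.0, 1.0]])
W_sn = W / spectral_norm(W)      # normalized layer: largest singular value is 1

print(spectral_norm(W), np.linalg.norm(W, 2), np.linalg.norm(W_sn, 2))
```

Since each normalized linear map is 1-Lipschitz, composing them with 1-Lipschitz activations keeps the whole network 1-Lipschitz.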

- Consider distance between distributions of image features
- Features from a pretrained ImageNet classifier
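- FID: $\|\mu_{\mathbb{P}} - \mu_{\mathbb{Q}}\|^2 + \mathrm{Tr}\big(\Sigma_{\mathbb{P}} + \Sigma_{\mathbb{Q}} - 2 (\Sigma_{\mathbb{P}} \Sigma_{\mathbb{Q}})^{1/2}\big)$, using feature means $\mu$ and covariances $\Sigma$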
- Estimator very biased, small variance
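
A minimal sketch of the FID computation on feature arrays, assuming the standard Fréchet distance between Gaussians fitted to the two feature sets. The random arrays stand in for features from a pretrained network, and `frechet_distance` is a hypothetical name.

```python
import numpy as np

def frechet_distance(feats_p, feats_q):
    """Frechet distance between Gaussians fitted to two (n, d) feature arrays."""
    mu_p, mu_q = feats_p.mean(axis=0), feats_q.mean(axis=0)
    cov_p = np.cov(feats_p, rowvar=False)
    cov_q = np.cov(feats_q, rowvar=False)
    # Tr((cov_p cov_q)^{1/2}) equals the sum of square roots of the eigenvalues of
    # cov_p @ cov_q, which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(cov_p @ cov_q).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    return float(np.sum((mu_p - mu_q) ** 2)
                 + np.trace(cov_p) + np.trace(cov_q) - 2.0 * trace_sqrt)

rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 4))                  # stand-in for network features
shifted = feats + np.array([2.0, 0.0, 0.0, 0.0])   # same covariance, mean moved by 2

print(frechet_distance(feats, feats), frechet_distance(feats, shifted))
```

Shifting every feature vector by the same offset leaves the covariance untouched, so only the squared mean difference contributes.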

- KID: use Maximum Mean Discrepancy instead
- Similar distance with unbiased, ~normal estimator!
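
A sketch of a KID-style computation: the unbiased estimator of squared MMD between two feature sets, here with the cubic polynomial kernel $k(x, y) = (x^\top y / d + 1)^3$. The random features and function names are hypothetical stand-ins.

```python
import numpy as np

def poly_kernel(a, b):
    """Cubic polynomial kernel k(x, y) = (x . y / d + 1)^3 between rows of a and b."""
    d = a.shape[1]
    return (a @ b.T / d + 1.0) ** 3

def mmd2_unbiased(x, y, kernel=poly_kernel):
    """Unbiased estimate of MMD^2: within-sample averages exclude the diagonal."""
    m, n = len(x), len(y)
    k_xx, k_yy, k_xy = kernel(x, x), kernel(y, y), kernel(x, y)
    sum_xx = k_xx.sum() - np.trace(k_xx)
    sum_yy = k_yy.sum() - np.trace(k_yy)
    return float(sum_xx / (m * (m - 1)) + sum_yy / (n * (n - 1)) - 2.0 * k_xy.mean())

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 8))
same = rng.normal(size=(200, 8))          # same distribution as `real`
far = rng.normal(size=(200, 8)) + 2.0     # clearly different distribution

kid_same = mmd2_unbiased(real, same)
kid_far = mmd2_unbiased(real, far)
print(kid_same, kid_far)
```

Because the estimator is unbiased, its value for two samples from the same distribution scatters around zero and can come out slightly negative.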

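$\mathrm{MMD}(\mathbb{P}, \mathbb{Q}) = \sup_{f : \|f\|_{\mathcal{H}} \le 1} \mathbb{E}_{X \sim \mathbb{P}}[f(X)] - \mathbb{E}_{Y \sim \mathbb{Q}}[f(Y)]$

$\|f\|_{\mathcal{H}}$ is smoothness induced by kernel $k$

Optimal $f$ *analytically*: $f^*(t) \propto \mathbb{E}_{X \sim \mathbb{P}}[k(t, X)] - \mathbb{E}_{Y \sim \mathbb{Q}}[k(t, Y)]$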

1.0 | 0.2 | 0.6
0.2 | 1.0 | 0.5
0.6 | 0.5 | 1.0

1.0 | 0.8 | 0.7
0.8 | 1.0 | 0.6
0.7 | 0.6 | 1.0

0.3 | 0.1 | 0.2
0.2 | 0.3 | 0.3
0.2 | 0.1 | 0.4
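
A worked version of the estimate from these three matrices. The labelling is an assumption, since the slide's labels were lost: the first block is read as kernel values among samples from $\mathbb{P}$, the second among samples from $\mathbb{Q}$, and the third between the two.

```python
K_XX = [[1.0, 0.2, 0.6],
        [0.2, 1.0, 0.5],
        [0.6, 0.5, 1.0]]
K_YY = [[1.0, 0.8, 0.7],
        [0.8, 1.0, 0.6],
        [0.7, 0.6, 1.0]]
K_XY = [[0.3, 0.1, 0.2],
        [0.2, 0.3, 0.3],
        [0.2, 0.1, 0.4]]

def mean_off_diagonal(K):
    """Average of the entries with i != j (each point compared with the others)."""
    n = len(K)
    return sum(K[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))

def mean_all(K):
    return sum(sum(row) for row in K) / (len(K) * len(K[0]))

# Unbiased estimate of MMD^2: within-P similarity + within-Q similarity - 2 * cross similarity.
mmd2 = mean_off_diagonal(K_XX) + mean_off_diagonal(K_YY) - 2.0 * mean_all(K_XY)
print(mmd2)
```

Points are more similar within each sample than across samples, so the estimate comes out clearly positive.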

- No need for a discriminator – just minimize the estimated $\mathrm{MMD}$!
- Continuous loss

Generator ($\mathbb{Q}_\theta$)

Critic

Target ($\mathbb{P}$)

How are these?

Not great!

:( I'll try harder…

⋮

MNIST, mix of Gaussian kernels

Celeb-A, mix of rational quadratic + linear kernels

- Features $\phi$: from pretrained Inception net
- Kernel $\kappa$ on those features: simple, exponentiated quadratic or polynomial

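- Don't just use one kernel, use a *class* parameterized by $\psi$
- New distance based on *all* these kernels: $\mathcal{D}_{\mathrm{MMD}}(\mathbb{P}, \mathbb{Q}) = \sup_\psi \mathrm{MMD}_{k_\psi}(\mathbb{P}, \mathbb{Q})$
- Turns out that this distance *isn't* continuous: can have $\mathbb{Q}_n \to \mathbb{P}$ in distribution while the distance does not go to $0$
- Scaled MMD GANs [Arbel+ NeurIPS-18] correct with a gradient penalty to make it continuous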

- “Easy parts” of the optimization done *in closed form*

If we want to find “more samples like ”:

- Conditional GANs: [Mirza+ 2014]
- Just add a class label as input to the generator $G$ and the discriminator $D$
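
One simple way to "add a class label as input", sketched under the assumption that the label is one-hot encoded and concatenated to the network's usual input (the latent noise for the generator, the data features for the discriminator). The helper names and sizes are hypothetical.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Encode integer class labels as one-hot rows."""
    out = np.zeros((len(labels), n_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def conditional_input(features, labels, n_classes):
    """Append the one-hot label to each row of the network's usual input."""
    return np.concatenate([features, one_hot(labels, n_classes)], axis=1)

rng = np.random.default_rng(0)
labels = np.array([3, 0, 7, 7])
z = rng.normal(size=(4, 16))                         # latent noise for 4 samples
g_in = conditional_input(z, labels, n_classes=10)    # generator input, shape (4, 26)

print(g_in.shape)
```

The same helper can feed the discriminator, which then judges whether a sample is real *for that class*.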

- BigGAN [Brock+ ICLR-19]: a bunch of tricks to make it huge

Slides (including links to papers) are online: `dougal.me/slides/gans-mlcc`