Semi-Supervised Learning
Outline:
Semi-Supervised Classification with GANs
- A much more generally useful application of GANs is semi-supervised learning, where we actually improve the performance of a classifier using a GAN.
- Many more current products and services use classification than generation.
- Object recognition models based on deep learning often achieve superhuman accuracy after they have been trained.
- Modern deep learning algorithms are not yet anywhere near human efficiency during learning.
- People are able to learn from very few examples provided by a teacher.
- But that’s probably because people also have all kinds of sensory experience that doesn’t come with labels.
- We don’t receive labels for most of our experiences.
- And we have a lot of experiences that don’t resemble anything that a modern deep learning algorithm gets to see in its training set.
- One path to improving the learning efficiency of deep learning models is semi-supervised learning.
- Semi-supervised learning can learn from the labeled examples like usual.
- But it can also get better at classification by studying unlabeled examples, even though those examples have no class label.
- Usually, it is much easier and cheaper to obtain unlabeled data than to obtain labeled data.
- To do semi-supervised classification with GANs, we’ll need to set up the GAN to work as a classifier.
- GANs contain two models, the generator and the discriminator.
- Usually we train both and then throw the discriminator away at the end of training.
- We usually only care about using the generator to create samples.
- The discriminator
- For semi-supervised learning, we focus on the discriminator rather than the generator.
- We’ll extend the discriminator to be our classifier and use it to classify new data after we’re done training it.
- We can actually throw away the generator, unless we also want to generate images.
- So far, a discriminator net with one sigmoid output gives us the probability that the input is real.
- We can turn this into a softmax with two outputs, one corresponding to the real class and one corresponding to the fake class.
- To use it as a classifier, we then split the real class into one output per real category, so the softmax has one output for each real class plus one output for the fake class.
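- As a rough sketch (not from the lecture), assuming PyTorch, ten real classes, and 32x32 RGB inputs, the extended discriminator might look like the following; the names `Discriminator`, `body`, `head`, and the 128-dimensional feature size are hypothetical placeholders, and the simple linear `body` stands in for whatever convolutional network is actually used.

```python
import torch
import torch.nn as nn

NUM_REAL_CLASSES = 10  # hypothetical: e.g. ten digit classes

class Discriminator(nn.Module):
    """Discriminator extended into a (K + 1)-way classifier:
    one output per real class plus one output for the fake class."""
    def __init__(self, feature_dim=128):
        super().__init__()
        # Placeholder feature extractor; a real model would use conv layers.
        self.body = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 32 * 3, feature_dim),
            nn.ReLU(),
        )
        # K real-class logits plus 1 fake-class logit, fed to a softmax.
        self.head = nn.Linear(feature_dim, NUM_REAL_CLASSES + 1)

    def forward(self, x):
        features = self.body(x)   # also reused later for feature matching
        logits = self.head(features)
        return logits, features
```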
- Training
- Now we can train the model using the sum of two costs.
- For the examples that have labels, we can use the regular supervised cross entropy cost.
- For all of the other examples and also for fake samples from the generator, we can add the GAN cost.
- To get the probability that the input is real, we just sum over the probabilities for all the real classes.
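- A minimal sketch of that combined cost, under the same hypothetical setup as above (the fake class is the extra output at index NUM_REAL_CLASSES):

```python
import torch
import torch.nn.functional as F

NUM_REAL_CLASSES = 10  # hypothetical, matching the sketch above

def real_probability(logits):
    """p(real | x): sum of the softmax probabilities of all real classes,
    which equals 1 - p(fake | x)."""
    probs = F.softmax(logits, dim=1)
    return probs[:, :NUM_REAL_CLASSES].sum(dim=1)

def discriminator_loss(labeled_logits, labels, unlabeled_logits, fake_logits):
    # Supervised cost: regular cross-entropy on the labeled examples.
    supervised = F.cross_entropy(labeled_logits, labels)

    # GAN cost: unlabeled real images should be classified as real,
    # samples from the generator should be classified as fake.
    eps = 1e-7
    unsupervised = -(torch.log(real_probability(unlabeled_logits) + eps).mean()
                     + torch.log(1.0 - real_probability(fake_logits) + eps).mean())

    # Train the discriminator/classifier on the sum of the two costs.
    return supervised + unsupervised
```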
- Normal classifiers can learn only on labeled images.
- This new setup can learn on
- labeled images
- real unlabeled images
- and even fake images from the generator.
- Altogether this results in very low error on the test set, because there are so many different sources of information even without using many labeled examples.
- To get this to work really well, we need one more trick called feature matching.
- Feature matching
- The idea of feature matching is to add a term to the cost function for the generator, penalizing the mean absolute error between the average value of some set of features on the training data and the average value of that set of features on the generated samples.
- The set of features can be any group of hidden units from the discriminator.
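- As a sketch, assuming the discriminator also returns its hidden features (as in the hypothetical sketches above), this extra generator term could be computed like this:

```python
import torch

def feature_matching_loss(real_features, fake_features):
    """Mean absolute error between the average features on the training data
    and the average features on the generated samples."""
    real_mean = real_features.mean(dim=0)  # average each feature over the batch
    fake_mean = fake_features.mean(dim=0)
    return torch.abs(real_mean - fake_mean).mean()
```

- Because the features are averaged over the batch before being compared, the generator only has to match the overall feature statistics of real data rather than any individual example.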
- So semi-supervised learning still has some catching up to do compared to the brute-force approach of just gathering tons and tons of labeled data.
- Usually, labeled data is the bottleneck that determines which tasks we are or aren’t able to solve with machine learning.
- Hopefully, using semi-supervised GANs, you’ll be able to tackle a lot of problems that weren’t possible to solve before.