Autoencoders

Can a NN generate new images? Creativity? Random numbers?

Autoencoders are NNs with two main characteristics:

  • The input and output layers have the same dimensions (the hidden layers are preferably smaller)

  • It is an unsupervised model, so no labels are needed

An autoencoder is, in fact, composed of two concatenated NNs, generally trained as a whole:

  • Encoder: tries to extract a selected set of features (the latent space) by compressing the input into a lower-dimensional space

  • Decoder: tries to reconstruct the original input as closely as possible (see the sketch after this list)
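
A minimal sketch of this encoder/decoder structure in PyTorch. The layer sizes are illustrative assumptions (e.g. 28x28 grayscale images flattened to 784 values); note how the input itself is used as the training target, so no labels are involved:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: compress the input into the lower-dimensional latent space
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder: reconstruct the original input from the latent vector
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)      # latent representation
        return self.decoder(z)   # reconstruction, same shape as x

model = Autoencoder()
x = torch.rand(16, 784)                      # dummy batch of 16 "images"
loss = nn.functional.mse_loss(model(x), x)   # the input is also the target
```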

Some applications: image denoising, anomaly detection, recommendation engines, generative models, etc.

The layer in red is also known as the LATENT SPACE
The latent space becomes a kind of compression, or we can think of it as a way to capture the 'essence' of the element. Hmmm... so, somehow, a NN can retain the 'essence' of things and other semantic features from images or videos?
Example of the denoising use of autoencoders. Pay attention, since this can help you understand how stable diffusion models work.
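
A minimal training sketch for the denoising case, reusing the `Autoencoder` model from above. The corruption (Gaussian noise with std 0.3) and the stand-in `loader` of random batches are illustrative assumptions; the key idea is that the input is the noisy image while the target is the clean one:

```python
import torch

# Stand-in data: in a real setting, `loader` would yield batches of clean images
loader = [torch.rand(16, 784) for _ in range(10)]

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for clean in loader:
    noisy = (clean + 0.3 * torch.randn_like(clean)).clamp(0, 1)  # corrupt the input
    recon = model(noisy)
    # The target is the CLEAN image, so the network learns to remove the noise
    loss = torch.nn.functional.mse_loss(recon, clean)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```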

Question: What can happen if we train an autoencoder with thousands of faces, where we use the same image as input and output and force the NN to find, in the latent space, the best way to get the essence of a face?

And what happens if, afterwards, we use just the decoder part, feeding it a new latent vector generated from random numbers?

We will be creating new faces! Isn't this creativity? A minimal sketch of the idea follows.
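
Here is that idea as a sketch, again reusing the `model` from above (the batch size and latent dimension are assumptions): we skip the encoder entirely and feed random latent vectors straight into the decoder.

```python
import torch

with torch.no_grad():
    z = torch.randn(8, 32)         # 8 random points in the 32-d latent space
    new_faces = model.decoder(z)   # decoded images never seen during training
print(new_faces.shape)             # torch.Size([8, 784])
```

In practice, with a plain autoencoder many random `z` vectors land in 'empty' regions of the latent space and decode into garbage; this is exactly the problem variational autoencoders address next.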

Example use of autoencoders as generative models

Variational autoencoders (VAEs) work by making the latent space more predictable, more continuous, and less sparse.

They do that by forcing the latent variables to become normally distributed, so the VAE gains control over the latent space. In practice, the encoder predicts a mean and a variance for each latent variable, and a KL-divergence penalty in the loss pushes them toward a standard normal distribution.

VAEs allow similar elements to lie closer together in the latent space, which makes it much easier to generate new 'good' variations in generative modeling. A sketch of the VAE idea follows.
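
A minimal sketch of the VAE idea in PyTorch (layer sizes are illustrative assumptions): the encoder outputs a mean and log-variance, a latent vector is sampled via the reparameterization trick, and a KL-divergence term pushes the latent variables toward a standard normal distribution:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)       # mean of q(z|x)
        self.logvar = nn.Linear(128, latent_dim)   # log-variance of q(z|x)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z = mu + sigma * epsilon
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    recon_loss = nn.functional.mse_loss(recon, x, reduction="sum")
    # KL divergence between q(z|x) = N(mu, var) and the prior N(0, I)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl

vae = VAE()
x = torch.rand(16, 784)
recon, mu, logvar = vae(x)
loss = vae_loss(recon, x, mu, logvar)
```

Because every latent variable is pushed toward N(0, 1), sampling `torch.randn(1, 32)` and decoding it now lands in a 'populated' region of the latent space, which is why VAE samples look so much better than those from a plain autoencoder.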
