Fake Anime Characters Using Deep Convolutional GANs


Final outputs from my project :)

Project Overview

Check Out My YouTube Video

Introducing The Paper

From the paper

The Nub

We propose and evaluate a set of constraints on the architectural topology of Convolutional GANs that make them stable to train in most settings. We name this class of architectures Deep Convolutional GANs (DCGAN).

We visualize the filters learnt by GANs and empirically show that specific filters have learned to draw specific objects.

After extensive model exploration, we identified a family of architectures that resulted in stable training across a range of datasets and allowed for training higher resolution and deeper generative models.

— Authors

How It Works

Directly applying batch norm to all layers, however, resulted in sample oscillation and model instability. This was avoided by not applying batch norm to the generator output layer and the discriminator input layer.

— Authors

Understanding Transposed Convolutions

A transposed convolution upsamples its input: each input element scales the kernel, the scaled copies are stamped onto overlapping positions of a larger output, and the overlaps are summed. The trace below works through a 2×2 input and a 2×2 kernel, first marking the untouched cells of each stamp with dashes, then filling them in as zeros, before summing everything into the 3×3 output.

=== Input ===
|0 1|
|2 3|
=== Kernel ===
|0 1|
|2 3|
=== 0 * Kernel ===
|0 0 -|
|0 0 -|
|- - -|
=== 1 * Kernel ===
|- 0 1|
|- 2 3|
|- - -|
=== 2 * Kernel ===
|- - -|
|0 2 -|
|4 6 -|
=== 3 * Kernel ===
|- - -|
|- 0 3|
|- 6 9|
=== 0 * Kernel ===
|0 0 0|
|0 0 0|
|0 0 0|
=== 1 * Kernel ===
|0 0 1|
|0 2 3|
|0 0 0|
=== 2 * Kernel ===
|0 0 0|
|0 2 0|
|4 6 0|
=== 3 * Kernel ===
|0 0 0|
|0 0 3|
|0 6 9|
=== Outputs ===
|0 0 1|
|0 4 6|
|4 12 9|
Implementing Transposed CNN for single channel from scratch
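
Here is a minimal NumPy sketch of that idea for a single channel; trans_conv and the array names are mine, not a library API:

import numpy as np

def trans_conv(X, K):
    """Stride-1 transposed convolution of 2-D input X with 2-D kernel K."""
    H, W = X.shape
    h, w = K.shape
    Y = np.zeros((H + h - 1, W + w - 1))
    # Each input element scales the kernel and stamps it onto the output;
    # overlapping stamps are summed.
    for i in range(H):
        for j in range(W):
            Y[i:i + h, j:j + w] += X[i, j] * K
    return Y

X = np.array([[0., 1.], [2., 3.]])
K = np.array([[0., 1.], [2., 3.]])
print(trans_conv(X, K))
# [[ 0.  0.  1.]
#  [ 0.  4.  6.]
#  [ 4. 12.  9.]]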

Padding in Transposed Convolutions

Padding works in reverse here: instead of growing the input, it trims rows and columns from the border of the output, so a padding of 1 applied to the 3×3 result above would leave only its 1×1 centre.

Implementing Transposed CNN with padding for single channel from scratch
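
A sketch of the padded variant, following the PyTorch convention that padding crops the output border (trans_conv is the helper above):

def trans_conv_pad(X, K, padding=0):
    """Transposed convolution where padding crops the output border."""
    Y = trans_conv(X, K)
    if padding > 0:
        Y = Y[padding:-padding, padding:-padding]
    return Y

print(trans_conv_pad(X, K, padding=1))
# [[4.]]  -- only the centre of the 3x3 result survives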

Strides in Transposed Convolutions

Strides also act in reverse: a stride of s spaces the stamped kernels s cells apart, enlarging the output. Equivalently, you can insert s − 1 zeros between the input elements, as in the dilated input below, and then run a stride-1 transposed convolution.

=== Input ===
|0 0 1|
|0 0 0|
|2 0 3|
Implementing Transposed CNN with strides for single channel from scratch
Source
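
And a sketch of the strided variant; spacing the stamps apart gives the same result as dilating the input with zeros (the 3×3 input printed above) and applying the stride-1 version:

def trans_conv_stride(X, K, stride=1):
    """Transposed convolution where stride spaces the stamped kernels apart."""
    H, W = X.shape
    h, w = K.shape
    Y = np.zeros((stride * (H - 1) + h, stride * (W - 1) + w))
    for i in range(H):
        for j in range(W):
            Y[i * stride:i * stride + h, j * stride:j * stride + w] += X[i, j] * K
    return Y

print(trans_conv_stride(X, K, stride=2))
# [[0. 0. 0. 1.]
#  [0. 0. 2. 3.]
#  [0. 2. 0. 3.]
#  [4. 6. 6. 9.]]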

Transposed Convolution is not Deconvolution

Deconvolution is the process of filtering a signal to compensate for an undesired convolution. The goal of deconvolution is to recreate the signal as it existed before the convolution took place — Source

A transposed convolution makes no such promise: it only restores the spatial shape of the pre-convolution input, not its values, so the two terms shouldn't be used interchangeably.

Model Initialization

The paper initializes all weights from a zero-centered Normal distribution with a standard deviation of 0.02.

Continuing To The Project

Anime faces from the dataset

Creating The DataLoader
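
A minimal sketch of a data pipeline that fits the rest of the post; the dataset path, batch size, and worker count are my placeholders, and the image size comes from the 3 x 128 x 128 generator output below:

import torch
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

image_size = 128   # matches the generator's 3 x 128 x 128 output
batch_size = 64    # placeholder; the post doesn't pin this down
stats = (0.5, 0.5, 0.5), (0.5, 0.5, 0.5)   # scale pixels to [-1, 1] for Tanh

train_ds = ImageFolder(
    "data/anime",  # placeholder path; images must sit inside a subfolder
    transform=T.Compose([
        T.Resize(image_size),
        T.CenterCrop(image_size),
        T.ToTensor(),
        T.Normalize(*stats),
    ]),
)
train_dl = DataLoader(train_ds, batch_size, shuffle=True,
                      num_workers=2, pin_memory=True)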

Creating The Discriminator

The discriminator
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 64, 64, 64]           3,136
         LeakyReLU-2           [-1, 64, 64, 64]               0
            Conv2d-3          [-1, 128, 32, 32]         131,072
       BatchNorm2d-4          [-1, 128, 32, 32]             256
         LeakyReLU-5          [-1, 128, 32, 32]               0
            Conv2d-6          [-1, 256, 16, 16]         524,288
       BatchNorm2d-7          [-1, 256, 16, 16]             512
         LeakyReLU-8          [-1, 256, 16, 16]               0
            Conv2d-9            [-1, 512, 8, 8]       2,097,152
      BatchNorm2d-10            [-1, 512, 8, 8]           1,024
        LeakyReLU-11            [-1, 512, 8, 8]               0
           Conv2d-12           [-1, 1024, 4, 4]       8,388,608
      BatchNorm2d-13           [-1, 1024, 4, 4]           2,048
        LeakyReLU-14           [-1, 1024, 4, 4]               0
           Conv2d-15              [-1, 1, 1, 1]          16,385
          Flatten-16                    [-1, 1]               0
================================================================
Total params: 11,164,481
Trainable params: 11,164,481
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.19
Forward/backward pass size (MB): 9.63
Params size (MB): 42.59
Estimated Total Size (MB): 52.40
----------------------------------------------------------------
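
A sketch of a discriminator that reproduces the summary above: kernel-4, stride-2, padding-1 convolutions halving the resolution at each step, LeakyReLU activations, batch norm everywhere except the input layer (per the paper), and a final 4×4 convolution collapsing the map to a single score:

import torch.nn as nn

discriminator = nn.Sequential(
    # in: 3 x 128 x 128 (no batch norm on the input layer, per the paper)
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1, bias=True),
    nn.LeakyReLU(0.2, inplace=True),
    # 64 x 64 x 64
    nn.Conv2d(64, 128, 4, 2, 1, bias=False),
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.2, inplace=True),
    # 128 x 32 x 32
    nn.Conv2d(128, 256, 4, 2, 1, bias=False),
    nn.BatchNorm2d(256),
    nn.LeakyReLU(0.2, inplace=True),
    # 256 x 16 x 16
    nn.Conv2d(256, 512, 4, 2, 1, bias=False),
    nn.BatchNorm2d(512),
    nn.LeakyReLU(0.2, inplace=True),
    # 512 x 8 x 8
    nn.Conv2d(512, 1024, 4, 2, 1, bias=False),
    nn.BatchNorm2d(1024),
    nn.LeakyReLU(0.2, inplace=True),
    # 1024 x 4 x 4
    nn.Conv2d(1024, 1, 4, 1, 0, bias=True),
    nn.Flatten(),  # one raw logit per image; the sigmoid lives in the loss
)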

Creating The Generator

The generator
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
   ConvTranspose2d-1           [-1, 1024, 4, 4]       2,097,152
       BatchNorm2d-2           [-1, 1024, 4, 4]           2,048
              ReLU-3           [-1, 1024, 4, 4]               0
   ConvTranspose2d-4            [-1, 512, 8, 8]       8,388,608
       BatchNorm2d-5            [-1, 512, 8, 8]           1,024
              ReLU-6            [-1, 512, 8, 8]               0
   ConvTranspose2d-7          [-1, 256, 16, 16]       2,097,152
       BatchNorm2d-8          [-1, 256, 16, 16]             512
              ReLU-9          [-1, 256, 16, 16]               0
  ConvTranspose2d-10          [-1, 128, 32, 32]         524,288
      BatchNorm2d-11          [-1, 128, 32, 32]             256
             ReLU-12          [-1, 128, 32, 32]               0
  ConvTranspose2d-13           [-1, 64, 64, 64]         131,072
      BatchNorm2d-14           [-1, 64, 64, 64]             128
             ReLU-15           [-1, 64, 64, 64]               0
  ConvTranspose2d-16          [-1, 3, 128, 128]           3,075
             Tanh-17          [-1, 3, 128, 128]               0
================================================================
Total params: 13,245,315
Trainable params: 13,245,315
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 12.38
Params size (MB): 50.53
Estimated Total Size (MB): 62.90
----------------------------------------------------------------
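
And a generator consistent with its summary: the 2,097,152 parameters of the first layer imply a 128-dimensional latent vector, which six transposed convolutions grow from 1×1 up to a 3×128×128 image, with batch norm and ReLU in between and Tanh (no batch norm) on the output, per the paper:

import torch.nn as nn

latent_size = 128  # inferred from the first layer's parameter count

generator = nn.Sequential(
    # in: latent_size x 1 x 1
    nn.ConvTranspose2d(latent_size, 1024, 4, 1, 0, bias=False),
    nn.BatchNorm2d(1024),
    nn.ReLU(True),
    # 1024 x 4 x 4
    nn.ConvTranspose2d(1024, 512, 4, 2, 1, bias=False),
    nn.BatchNorm2d(512),
    nn.ReLU(True),
    # 512 x 8 x 8
    nn.ConvTranspose2d(512, 256, 4, 2, 1, bias=False),
    nn.BatchNorm2d(256),
    nn.ReLU(True),
    # 256 x 16 x 16
    nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False),
    nn.BatchNorm2d(128),
    nn.ReLU(True),
    # 128 x 32 x 32
    nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(True),
    # 64 x 64 x 64
    nn.ConvTranspose2d(64, 3, 4, 2, 1, bias=True),
    nn.Tanh(),  # out: 3 x 128 x 128 in [-1, 1]; no batch norm here, per the paper
)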

Initializing Parameters

Initializing parameters
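
A sketch of the usual DCGAN initializer: conv weights from the paper's zero-centred Normal with std 0.02, and the batch-norm treatment following the common PyTorch DCGAN recipe:

def weights_init(m):
    classname = m.__class__.__name__
    if classname.find("Conv") != -1:
        # N(0, 0.02), as the paper prescribes
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find("BatchNorm") != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0.0)

generator.apply(weights_init)
discriminator.apply(weights_init)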

Defining Loss Functions

Adversarial loss
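
Since the discriminator above ends in a bare Flatten, a numerically stable choice is binary cross-entropy on the raw logits; the helper names below are mine:

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

def discriminator_loss(real_logits, fake_logits):
    # push real logits towards 1 and fake logits towards 0
    real_loss = criterion(real_logits, torch.ones_like(real_logits))
    fake_loss = criterion(fake_logits, torch.zeros_like(fake_logits))
    return real_loss + fake_loss

def generator_loss(fake_logits):
    # non-saturating trick: ask the discriminator to call fakes real
    return criterion(fake_logits, torch.ones_like(fake_logits))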

Defining Optimizers

Initializing optimizers
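
The paper's recipe is Adam with a learning rate of 0.0002 and beta1 lowered to 0.5:

import torch

opt_d = torch.optim.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))
opt_g = torch.optim.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))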

Training The Model
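
Reusing the pieces above, training alternates one discriminator step and one generator step per batch. This is a sketch of the standard loop, and the epoch count is a placeholder:

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
generator.to(device)
discriminator.to(device)

epochs = 25  # placeholder
for epoch in range(epochs):
    for real_images, _ in train_dl:
        real_images = real_images.to(device)
        b = real_images.size(0)

        # discriminator step: real images up, generated images down
        opt_d.zero_grad()
        latent = torch.randn(b, latent_size, 1, 1, device=device)
        fake_images = generator(latent)
        loss_d = discriminator_loss(discriminator(real_images),
                                    discriminator(fake_images.detach()))
        loss_d.backward()
        opt_d.step()

        # generator step: fool the discriminator on the same batch of fakes
        opt_g.zero_grad()
        loss_g = generator_loss(discriminator(fake_images))
        loss_g.backward()
        opt_g.step()

    print(f"epoch {epoch + 1}/{epochs}  loss_d: {loss_d.item():.4f}  loss_g: {loss_g.item():.4f}")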

Final Outputs

Pretty cool to be honest, considering how simple the models were

Noise Interpolation

[((1.0 - (i/k)) * noise1) + ((i/k) * noise2) for i in range(k + 1)]
These images are really good for the most part
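
For completeness, here's the interpolation snippet above fleshed out; k and the two noise vectors are assumptions:

import torch

k = 8  # number of interpolation steps (assumption)
noise1 = torch.randn(1, latent_size, 1, 1, device=device)
noise2 = torch.randn(1, latent_size, 1, 1, device=device)

# linear walk from noise1 to noise2, endpoints included
frames = [((1.0 - (i / k)) * noise1) + ((i / k) * noise2) for i in range(k + 1)]

generator.eval()
with torch.no_grad():
    images = [generator(z).cpu() for z in frames]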

Conclusion

