Gyarados, a water type

Gyarados as fire type

Gyarados as grass type

Gyarados as electric type

pokemon2pokemon: Using Neural Networks to Generate Pokemon as Different Elemental Types

June 3, 2019

Have you ever wondered what a Gyarados would look like as a fire type? Or grass type, or electric type?

For my last project at the Recurse Center, I trained CycleGAN, an image-to-image translation model, on images of Pokémon of different types.

Ho-oh, a fire type

Ho-oh as dark type

Model Overview

CycleGAN is an image-to-image translation model that allows us to “translate” from one set of images to another. For more on CycleGAN, see previous blog posts on image-to-image translation with CycleGAN and pix2pix.

The open-source implementation I used to train the model and generate these images of Pokémon is written in PyTorch and can be found on GitHub. For this project, I trained the model to translate between sets of Pokémon images of different types, e.g. translating images of water types to fire types.

Training Data

I found the original dataset of Pokémon images and their types on Kaggle; it covers Generations 1-7. I wrote a script to sort the Pokémon images by their primary type.

The resulting dataset and the script can both be found on my GitHub.
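Here's a rough sketch of what a sorting script like this could look like. It is not the actual script from the repo: the column names `Name` and `Type1`, the one-image-per-Pokémon file naming, and the function name `sort_by_primary_type` are all assumptions made for illustration.

```python
import csv
import shutil
from pathlib import Path


def sort_by_primary_type(csv_path, image_dir, out_dir,
                         name_col="Name", type_col="Type1"):
    """Copy each image into out_dir/<primary type>/ based on a CSV lookup.

    Assumes (hypothetically) one CSV row per Pokémon with a name column and
    a primary-type column, and image files named <name>.<ext>.
    """
    image_dir, out_dir = Path(image_dir), Path(out_dir)
    # Index images by lowercase file stem so lookups ignore case.
    images = {p.stem.lower(): p for p in image_dir.iterdir() if p.is_file()}

    counts = {}
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            src = images.get(row[name_col].strip().lower())
            if src is None:
                continue  # no image for this row; skip rather than fail
            ptype = row[type_col].strip().lower()
            dest = out_dir / ptype
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest / src.name)
            counts[ptype] = counts.get(ptype, 0) + 1
    return counts
```

The returned counts are handy for spotting types with too few images to train on.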

Results

For each pair of images, on the left is the original image of the Pokémon, and on the right is the type-translated version. (Results are best viewed if you turn off f.lux, night shift, or any other display mode that changes the color of your screen.)

Water type -> other types

Dewgong, water type

Dewgong as grass type

Lapras, water type

Lapras as grass type

Azumarill, water type

Azumarill as grass type

Kingdra, water type

Kingdra as grass type

Clawitzer, water type

Clawitzer as fire type

Empoleon, water type

Empoleon as grass type

Greninja, water type

Greninja as grass type

Keldeo, water type

Keldeo as grass type

Cloyster, water type

Cloyster as electric type

Lapras, water type

Lapras as fire type

Kyogre, water type

Kyogre as grass type

Feraligatr, water type

Feraligatr as grass type

Clawitzer, water type

Clawitzer as grass type

Carracosta, water type

Carracosta as fire type

Greninja, water type

Greninja as fire type

Clauncher, water type

Clauncher as grass type

Fire type -> other types

Slugma, fire type

Slugma as dark type

Ponyta, fire type

Ponyta as dark type

Combusken, fire type

Combusken as dark type

Torkoal, fire type

Torkoal as water type

Darmanitan, fire type

Darmanitan as dark type

Delphox, fire type

Delphox as dark type

Simisear, fire type

Simisear as dark type

Pignite, fire type

Pignite as water type

Heatmor, fire type

Heatmor as electric type

Ho-oh, fire type

Ho-oh as electric type

Rapidash, fire type

Rapidash as dark type

Blaziken, fire type

Blaziken as dark type

Flareon, fire type

Flareon as water type

Darmanitan, fire type

Darmanitan as electric type

Delphox, fire type

Delphox as water type

Simisear, fire type

Simisear as water type

Magmortar, fire type

Magmortar as electric type

Talonflame, fire type

Talonflame as water type

Grass type -> other types

Bellossom, grass type

Bellossom as water type

Grovyle, grass type

Grovyle as water type

Maractus, grass type

Maractus as water type

Leafeon, grass type

Leafeon as water type

Sceptile, grass type

Sceptile as water type

Pansage, grass type

Pansage as water type

Electric type -> other types

Electivire, electric type

Electivire as dark type

Thundurus, electric type

Thundurus as fire type

Dragon type -> other types

Latios, dragon type

Latios as grass type

Kyurem, dragon type

Kyurem as dark type

Garchomp, dragon type

Garchomp as dark type

Zekrom, dragon type

Zekrom as fire type

Rayquaza, dragon type

Rayquaza as fire type

Salamence, dragon type

Salamence as fire type

Haxorus, dragon type

Haxorus as fire type

Zygarde, dragon type

Zygarde as fire type

Dark type -> other types

Darkrai, dark type

Darkrai as dragon type

Yveltal, dark type

Yveltal as electric type

Hydreigon, dark type

Hydreigon as fire type

samoyed2bernese: Using CycleGAN for Image-to-Image Translation between Samoyeds and Bernese Mountain Dogs

April 19, 2019

Dogs!!! More dogs this week!!! Is it possible I picked this project because I was in the mood for dog pictures? Absolutely.

This week, I used the CycleGAN image-to-image translation model to translate between images of Samoyeds and Bernese mountain dogs, two of my favorite dogs. If you’re not familiar with these breeds, you’re in luck, because here are some dog pictures for your reference. (Such good dogs!!)

Model Overview

CycleGAN builds off of the pix2pix network, a conditional generative adversarial network (or cGAN) that can map paired input and output images. Unlike pix2pix, CycleGAN is able to train on unpaired sets of images. For more on pix2pix and CycleGAN, see my previous blog post here.

The CycleGAN implementation used to train and generate dog pictures uses PyTorch and can be found on GitHub here. (This repo also contains a pix2pix implementation, which I used previously to generate circuit cities.)

A major strength of CycleGAN over pix2pix is that your datasets can be unpaired. For pix2pix, you may have to really dig, curate, or create your own dataset of 1-to-1 paired images. For example, if you wanted to translate daytime photos to nighttime photos with pix2pix, you would need a pair of daytime and nighttime photos of the same location. With CycleGAN, you can just have a set of daytime photos of any location and a set of nighttime photos of any location and call it a day (no pun intended).
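To make the difference concrete, here is a small sketch of how training examples could be drawn in each case. The function names and the dictionary-of-paths data layout are hypothetical; real implementations wrap this logic in dataset classes.

```python
import random


def paired_batches(day, night):
    """pix2pix-style: every input needs its own matching target.

    `day` and `night` map a location id to an image path; only locations
    present in both dictionaries can be used.
    """
    shared = sorted(day.keys() & night.keys())
    return [(day[k], night[k]) for k in shared]


def unpaired_batches(set_a, set_b, n, seed=0):
    """CycleGAN-style: draw from each set independently, no matching needed."""
    rng = random.Random(seed)
    return [(rng.choice(set_a), rng.choice(set_b)) for _ in range(n)]
```

With paired data, any photo without a counterpart is thrown away; with unpaired data, every photo in either set is usable.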

Another strength of CycleGAN over, say, neural style transfer, is that the translations can be localized. In the following examples, you’ll see that the translation applies only to the dog. Object recognition is implied, and the non-dog portions of the images are not really affected. With neural style transfer, you’re applying a style transformation to the entire image.

As an aside, I originally ran CycleGAN on a set of images of forests, and a set of images of forest paintings. While the results did turn out as expected, I realized this kind of task is really best suited for neural style transfer. (Which inspired me to implement it from scratch! See previous blog post on implementing neural style transfer from scratch in PyTorch.)

Training Data

To train the model, I used 218 images of Samoyeds and 218 images of Bernese mountain dogs from one of my favorite datasets currently on the internet: the Stanford Dogs Dataset. So many good dogs!! My heart!!

Training took a couple of hours on an NVIDIA GeForce GTX 1080 Ti GPU, and generating results took only a few minutes.

Results

In the following examples, on the left is the input, a real photo of a Samoyed. On the right is the CycleGAN output, a generated image translated from the input into a Bernese mountain dog.

Notes

Note that, since this is a blog post and not a scientific paper, I’ve only included the more effective results in this post. For example, bernese2samoyed doesn’t look quite as good — it just looks like white-out was applied to the dog lol.

I would add that a major strength of CycleGAN is that the changes are applied locally, and not to the entire image. The network is able to identify the boundaries of dog and not-dog.

Another note is that this approach seems to work best when translating between inputs with similar shapes. In these results, mainly the coloring was transferred, and not so much the dog shape. I would posit that breeds that are similar in shape would yield more effective results, e.g. translating between golden and chocolate labs, or between tabby cats and tortoiseshell cats.

Circuit Cities with Pix2Pix: Using Image-to-Image Translation with Generative Adversarial Networks to Create Buildings, Maps, and Satellite Images from Circuit Boards

March 6, 2019

I’ve been playing around with generative adversarial networks this week. In particular, I’ve been using image-to-image translation to see what we can create from images of circuit boards.

I’ve noticed before that circuit boards mildly resemble aerial geospatial images. What kinds of cities could we build with them?

Model Overview

GANs

GAN stands for Generative Adversarial Network: generative, because we are using it to generate data; adversarial, because it consists of two competing networks; and network, because we are describing a neural network architecture.

Essentially, you have two models competing: a generator that generates fake images, and a discriminator that judges whether an image is fake or real.

First, we generate a bunch of fake images using the generator. Then, we take these fake images to the discriminator, which classifies images as fake or real. We then feed the information on how the discriminator determined which images are fake back to the generator, so it can generate better fake images. We repeat this process, taking turns training the generator and the discriminator, until the discriminator can no longer tell which images are real and which are fake (generated).
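The alternating loop described above can be sketched on a toy problem. This is not the image model from this post: it assumes one-dimensional "images" (single numbers), a linear generator G(z) = a*z + b, a logistic discriminator D(x) = sigmoid(w*x + c), and hand-written gradients, all chosen only to keep the example tiny and dependency-free.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=20, batch=32, lr=0.05, seed=0):
    """Alternate discriminator and generator updates on 1-D data.

    Real samples come from N(4, 0.5); these numbers are illustrative
    assumptions, not values from any real experiment.
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters
    w, c = 0.0, 0.0   # discriminator parameters

    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]
        zs = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in zs]

        # 1) Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        gw = gc = 0.0
        for xr, xf in zip(real, fake):
            dr, df = sigmoid(w * xr + c), sigmoid(w * xf + c)
            gw += (dr - 1.0) * xr + df * xf
            gc += (dr - 1.0) + df
        w -= lr * gw / batch
        c -= lr * gc / batch

        # 2) Generator step: use the discriminator's feedback to make
        #    D(fake) larger (non-saturating loss, -log D(G(z))).
        ga = gb = 0.0
        for z in zs:
            df = sigmoid(w * (a * z + b) + c)
            ga += (df - 1.0) * w * z
            gb += (df - 1.0) * w
        a -= lr * ga / batch
        b -= lr * gb / batch

    return a, b, w, c
```

In the first few steps, the discriminator learns that larger numbers look "real", and the generator responds by shifting its offset `b` upward toward the real data. Real GANs follow the same take-turns structure, just with neural networks and automatic differentiation in place of the hand-written gradients.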

Pix2Pix

The pix2pix model uses conditional adversarial networks (aka cGANs, conditional GANs) trained to map input to output images, where the output is a “translation” of the input. For image-to-image translation, instead of simply generating realistic images, we add the condition that the generated image is a translation of an input image. To train cGANs, we use pairs of images, one as an input and one as the translated output.

For example, if we train on pairs of black-and-white images (input) and their color versions (translation), we then have a model that can generate color photos given a black-and-white photo. If we train on pairs of day (input) and night (translation) images of the same location, we have a model that can generate night photos from day photos.
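For reference, the objective from the pix2pix paper can be written as follows, where x is the input image, y is the paired target, z is a noise source, G is the generator, and D is the discriminator. The discriminator sees the input alongside the real or generated output, which is what makes the GAN "conditional"; the L1 term, weighted by λ, keeps the generated output close to the paired target.

```latex
\mathcal{L}_{cGAN}(G, D) =
    \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big]

G^{*} = \arg\min_{G}\max_{D}\; \mathcal{L}_{cGAN}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
```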

CycleGAN

A related model architecture is CycleGAN (original CycleGAN paper), which builds off of the pix2pix architecture, but allows you to train the model without having explicit pairings. For example, we can have one dataset of day images, and one dataset of night images; it’s not necessary to have a specific pairing of a day and night image of the same location. To train CycleGAN, we can use unpaired training data. (CycleGAN is not used here but I hope to explore it more this week!)
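The way the CycleGAN paper gets away without pairs is a cycle-consistency term: it trains two generators, one for each direction, and penalizes round trips (A to B and back to A, or B to A and back to B) that don't return to the starting image. Below is a minimal sketch of that term, assuming images are flat lists of numbers and generators are plain Python callables; the function names are made up for illustration.

```python
def l1(u, v):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(p - q) for p, q in zip(u, v)) / len(u)


def cycle_consistency_loss(g_ab, g_ba, batch_a, batch_b):
    """Penalize A -> B -> A and B -> A -> B round trips that drift.

    g_ab and g_ba are the two generators (any callables on vectors here).
    """
    loss_a = sum(l1(g_ba(g_ab(x)), x) for x in batch_a) / len(batch_a)
    loss_b = sum(l1(g_ab(g_ba(y)), y) for y in batch_b) / len(batch_b)
    return loss_a + loss_b
```

If the two generators are exact inverses of each other, the loss is zero. In the full model this term is added to the usual adversarial losses for each direction.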

The pretrained models I used for these explorations are from a PyTorch implementation of pix2pix that can be found on GitHub.

Results

In the results below, on the left is the circuit board image input, and on the right is the generated translation.

Circuit Boards to Buildings

For these, I used the facades_label2photo pretrained model, originally trained on paired images like this:

Circuits to Maps

For these, I used the sat2map pretrained model, originally trained on pairs of satellite aerial images (input) and Google Maps images (translation).

Circuits to Satellite Images

For these, I used the map2sat pretrained model, originally trained on pairs of Google Maps images (input) and satellite aerial images (translation).