CycleGAN builds on the pix2pix network, a conditional generative adversarial network (or cGAN) that learns a mapping between paired input and output images. Unlike pix2pix, CycleGAN is able to train on unpaired sets of images. For more on pix2pix and CycleGAN, see my previous blog post here.
The CycleGAN implementation used to train and generate dog pictures uses PyTorch and can be found on GitHub here. (This repo also contains a pix2pix implementation, which I had used previously to generate circuit cities.)
A major strength of CycleGAN over pix2pix is that your datasets can be unpaired. For pix2pix, you may have to really dig, curate, or create your own dataset of 1-to-1 paired images. For example, if you wanted to translate daytime photos to nighttime photos with pix2pix, you would need a pair of daytime and nighttime photos of the same location. With CycleGAN, you can just have a set of daytime photos of any location and a set of nighttime photos of any location and call it a day (no pun intended).
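The trick that makes unpaired training possible is CycleGAN's cycle-consistency loss: translating an image to the other domain and back should recover the original, so no ground-truth pairs are needed. Here's a minimal sketch of that idea in PyTorch, with tiny conv layers standing in for CycleGAN's real generators (the names and shapes are illustrative, not from the repo, and the full objective also includes adversarial losses from two discriminators):

```python
import torch
import torch.nn as nn

# Toy stand-ins for CycleGAN's two generators (the real ones are
# deep ResNet-based networks; a single conv layer is just for illustration).
G_XtoY = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # e.g. day -> night
G_YtoX = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # e.g. night -> day

l1 = nn.L1Loss()

x = torch.randn(1, 3, 64, 64)  # an unpaired sample from domain X
y = torch.randn(1, 3, 64, 64)  # an unpaired sample from domain Y

# Cycle consistency: X -> Y -> X and Y -> X -> Y should each
# reconstruct the input, which is what lets the datasets be unpaired.
cycle_loss = l1(G_YtoX(G_XtoY(x)), x) + l1(G_XtoY(G_YtoX(y)), y)
cycle_loss.backward()  # gradients flow through both generators
```

Note that neither loss term ever compares `x` to a "true" translated version of `x`; each image is only compared against its own reconstruction.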
Another strength of CycleGAN over, say, neural style transfer, is that the translations can be localized. In the following examples, you’ll see that the translation applies only to the dog: the network implicitly learns to recognize the dog, and the non-dog portions of the images are largely left alone. With neural style transfer, you’re applying a style transformation to the entire image.
As an aside, I originally ran CycleGAN on a set of images of forests, and a set of images of forest paintings. While the results did turn out as expected, I realized this kind of task is really best suited for neural style transfer. (Which inspired me to implement it from scratch! See my blog post here.)
To train the model, I used 218 images of Samoyeds and 218 images of Bernese mountain dogs from one of my favorite datasets currently on the internet: the Stanford Dogs Dataset. So many good dogs!! My heart!!
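Because the two breeds don't need to be paired up, the data loading can be as simple as drawing one image from each set per step. Here's a hypothetical sketch of such an unpaired dataset (the repo has its own dataset classes; random tensors stand in for the actual Stanford Dogs images):

```python
import random
import torch
from torch.utils.data import Dataset

class UnpairedDataset(Dataset):
    """Yields one image from each domain per index, with no fixed pairing."""

    def __init__(self, domain_a, domain_b):
        self.domain_a = domain_a  # e.g. 218 Samoyed images
        self.domain_b = domain_b  # e.g. 218 Bernese mountain dog images

    def __len__(self):
        return max(len(self.domain_a), len(self.domain_b))

    def __getitem__(self, idx):
        a = self.domain_a[idx % len(self.domain_a)]
        # Draw a random image from the other domain, so A and B samples
        # are never matched up -- the whole point of unpaired training.
        b = self.domain_b[random.randrange(len(self.domain_b))]
        return a, b

# Random tensors standing in for loaded, preprocessed photos.
samoyeds = [torch.randn(3, 256, 256) for _ in range(4)]
bernese = [torch.randn(3, 256, 256) for _ in range(4)]
dataset = UnpairedDataset(samoyeds, bernese)
a, b = dataset[0]
```

In practice you'd load the images from disk with something like torchvision transforms, but the key design point is just that `__getitem__` never assumes any correspondence between the two sets.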
Training took a couple of hours on an NVIDIA GeForce GTX 1080 Ti GPU, and generating results only took a few minutes.
In the following examples, on the left is the input, a real photo of a Samoyed. On the right is the CycleGAN output, a generated image translated from the input into a Bernese mountain dog.