This week, I implemented a character-level recurrent neural network (char-rnn for short) in PyTorch and used it to generate fake book titles. The code, training data, and pre-trained models are available in my GitHub repo.

Heart in the Dark

Me the Bean

Be the Life

Yours

Model Overview

Diagram of the char-rnn network architecture. Source.

The char-rnn language model is a recurrent neural network that makes predictions on the character level. In contrast, many language models operate on the word level.
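To make the distinction concrete, here is a minimal sketch of how the same title looks to a character-level model and to a word-level model. The variable names are only illustrative, and the index mapping is one simple choice among many.

```python
title = "Heart in the Dark"

# Character-level: every character, including spaces, is one token.
char_tokens = list(title)

# Word-level: split on whitespace, so each word is one token.
word_tokens = title.split()

# A character vocabulary stays small: map each distinct character to an index.
char_vocab = {ch: i for i, ch in enumerate(sorted(set(char_tokens)))}
char_ids = [char_vocab[ch] for ch in char_tokens]

print(len(char_tokens), len(word_tokens), len(char_vocab))  # 17 4 11
```

The character-level model sees a longer sequence but needs only a tiny vocabulary, which is part of why it can invent words that never appear in the training data.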

Making character-level predictions can be a bit more chaotic, but might be better for making up fake words (e.g. Harry Potter spells, band names, fake slang, fake cities, fantasy terms, etc.). Word-level language models might have an advantage for generating longer pieces of text, like summaries or fiction, as they don’t need to figure out how to spell, in a sense.

There do exist character-word hybrid approaches. For example, the GPT-2 model uses byte pair encoding, an approach that interpolates between the word-level for common sequences and the character-level for rare sequences.
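As a rough illustration of the idea behind byte pair encoding, the toy sketch below repeatedly merges the most frequent adjacent pair of symbols. It works on characters and a made-up three-word corpus; GPT-2's tokenizer works on bytes with a much larger learned merge table, so treat this as a simplification rather than its actual implementation.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words and return the most common."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged

# Made-up corpus: each word starts as a tuple of characters, mapped to a count.
corpus = {tuple("dark"): 5, tuple("bark"): 1, tuple("bean"): 1}

for _ in range(3):
    corpus = merge_pair(corpus, most_frequent_pair(corpus))

print(corpus)
```

After three merges, the common word "dark" has collapsed into a single token, while the rarer "bark" is still split into "b" + "ark" and "bean" is still spelled out character by character.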

This particular char-rnn implementation is set up to handle multiple categories of text. In this use case, it is able to make predictions for different book genres, e.g. Romance, Fantasy, Young Adult, etc.
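One common way to condition a char-rnn on a category is to concatenate a one-hot genre vector with each one-hot character before feeding it to the recurrent layer. The sketch below shows that idea in PyTorch. The class name, the use of a GRU, the hidden size, and the alphabet size of 60 are all assumptions made for illustration, not a description of the model in the repo.

```python
import torch
import torch.nn as nn

class CategoryCharRNN(nn.Module):
    """Hypothetical category-conditioned char-rnn (not the author's exact model)."""

    def __init__(self, n_categories, n_chars, hidden_size):
        super().__init__()
        # The genre one-hot is concatenated with each character one-hot.
        self.gru = nn.GRU(n_categories + n_chars, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, n_chars)

    def forward(self, category, chars, hidden=None):
        # category: (batch, n_categories); chars: (batch, seq_len, n_chars)
        seq_len = chars.size(1)
        cat = category.unsqueeze(1).expand(-1, seq_len, -1)
        output, hidden = self.gru(torch.cat([cat, chars], dim=2), hidden)
        # Logits over the next character at every position.
        return self.out(output), hidden

n_categories, n_chars = 30, 60  # 30 genres as in the post; 60 is an assumed alphabet size
model = CategoryCharRNN(n_categories, n_chars, hidden_size=128)

genre = torch.zeros(1, n_categories)
genre[0, 0] = 1.0  # index 0 standing in for one genre (assumed ordering)

chars = torch.zeros(1, 5, n_chars)
chars[0, torch.arange(5), torch.tensor([7, 4, 0, 17, 19])] = 1.0  # five arbitrary character indices

logits, hidden = model(genre, chars)
print(logits.shape)  # torch.Size([1, 5, 60])
```

Because the genre vector is part of every input step, swapping it at generation time is enough to steer the same network toward a different genre.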

Training Data

The training data for this model is a modified version of a Goodreads data scrape of 20K book titles. I transformed the CSV file into separate text files, one for each of the top 30 genres. The resulting split dataset can be found in my GitHub repo.
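A split like this can be sketched in a few lines of standard-library Python. The column names "title" and "genre", the one-genre-per-row layout, and the function names are assumptions for illustration; the real scrape may be structured differently.

```python
import csv
import io
from collections import Counter, defaultdict
from pathlib import Path

def titles_by_top_genres(csv_file, top_n):
    """Group titles by genre and keep only the `top_n` most frequent genres."""
    titles = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # "title" and "genre" are assumed column names, not the real schema.
        titles[row["genre"]].append(row["title"])
    counts = Counter({genre: len(items) for genre, items in titles.items()})
    return {genre: titles[genre] for genre, _ in counts.most_common(top_n)}

def write_genre_files(grouped, out_dir):
    """Write one text file per genre, with one title per line."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for genre, items in grouped.items():
        (out / f"{genre}.txt").write_text("\n".join(items) + "\n", encoding="utf-8")

# Tiny in-memory stand-in for the real CSV, with placeholder titles.
sample = io.StringIO(
    "title,genre\n"
    "Example Title One,Romance\n"
    "Example Title Two,Romance\n"
    "Example Title Three,Fantasy\n"
)
grouped = titles_by_top_genres(sample, top_n=1)
print(grouped)  # {'Romance': ['Example Title One', 'Example Title Two']}
```

Calling `write_genre_files(grouped, "data")` would then produce one file per genre, which is a convenient layout for a model that trains on one category at a time.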

Training the model took about 20 minutes on an NVIDIA GeForce GTX 1080 Ti GPU. Generating samples takes only a few seconds.

Results

The following results are a selected sampling of outputs. Note that I’m mainly including examples that consist of real words, with a few exceptions.

Romance
Heart in the Dark

Years of the Dark

You the Book

The Stove to the Story

Fantasy
Growing the Dark

Book of the Dark

Red Sande

Fiction
In the Bead Store

Jen the Bead

King the Bean

Historical
A to the Bean

Other and Story

Science Fiction
Darke Sers

Voringe

In the Beantire

Mystery
Bed Singe

Kiss of the Dark

Red Story

Classics
A Mander of the Suckers

Gorden the Story of Merica

Childrens
Dark Book of the Story of the Sures of the Surating

Late

Story of the Bean

Paranormal
A Store of the Store

Red Store

Stariss and Storiss

Wind Store

New Adult
Live Me Life

Growing Me

In the Bean

Me the Bean

Poetry
Yours

Me

Erotica
Volle the Story of Men

King of the Dark

Dork of the Dark

Work of the Dark

Bed Storys of the Dark

Your Mind

Biography
Be the Life

On Anger and Of Mand Anger

Comically, many of the book titles revolve around beans, beads, stores, and darkness. While I did notice some subtle differences between genres, they don’t appear to be particularly drastic overall.

Model Overview

The GPT-2 model uses conditional probability language modeling with a Transformer neural network architecture that relies on self-attention mechanisms (inspired by attention mechanisms from image processing tasks) in lieu of recurrence or convolution. (Side note: interesting to see how advancements in neural networks for image and language processing co-evolve.)
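For readers unfamiliar with self-attention, here is a bare-bones sketch of scaled dot-product attention in plain Python. It uses the inputs directly as queries, keys, and values. A real Transformer such as GPT-2 adds learned projections, multiple heads, and causal masking, all of which are left out here.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned weights)."""
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        # Each output is a weighted average of all input vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0]]  # two toy token vectors
attended = self_attention(tokens)
print(attended)
```

Every position looks at every other position in a single step, which is what lets the architecture drop recurrence entirely.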

The model is trained on about 8 million documents, or about 40 GB of text, from web pages. The dataset, called WebText, was built for this model by scraping outbound links from Reddit posts with at least 3 karma. (Some thoughts on this later. See section on “Training Data”)

In the original GPT model, unsupervised pre-training was an initial step, followed by supervised fine-tuning for various tasks, such as question answering. GPT-2, however, is assessed using only the pre-training step, without the supervised fine-tuning. In other words, the model performs well in a zero-shot setting.
