How I built an Afrikaans AI Poetry Bot


After letting go of my childhood dreams of building my own Terminator (or Chappie or JARVIS) I settled for something that writes (pretty bad) Afrikaans poetry, among other things.

Here’s how I did it.

First, What is AI?

Read ten books about it and you’ll probably end up with ten different definitions. Broadly, however, it is about software’s ability to make sense of data and take the action that has the best chance of achieving a goal. Or in my Terminator example: software and machines that exhibit human intelligence.

If you want a deep dive – especially on the intricacies and differences between machine learning, neural networks and deep learning – here are two great, free resources:

Even if you’re not a coder or developer (I’m not), you’ll learn a lot from those two alone. But let’s get on to the fun stuff.

Building my poetry bot

Training data

To build this sort of bot, you need to train it on a set of training data. I picked something rather obscure (which I’m sure nobody has done before): a collection of poems by one of the leading figures in Afrikaans poetry, C. Louis Leipoldt.

You can find lots of older free books on Project Gutenberg (which you can download in raw text format). 

I stripped the extra data and noise from the collection – everything that isn’t poetry, like the index and introduction – and saved the file as leipoldt.txt.
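Part of that clean-up can be scripted. Here is a minimal sketch, assuming the download is a Project Gutenberg plain-text file whose licence header and footer are fenced off by lines starting with "*** START OF" and "*** END OF" (check your own file, the markers vary). The function names are made up for illustration. It only removes the Gutenberg boilerplate; the index and introduction still have to be cut by hand.

```python
import re

# Assumption: the raw file wraps the book in "*** START OF ..." and
# "*** END OF ..." marker lines, as Project Gutenberg texts usually do.
START_RE = re.compile(r"^\*\*\* ?START OF.*$", re.MULTILINE)
END_RE = re.compile(r"^\*\*\* ?END OF.*$", re.MULTILINE)


def strip_gutenberg_boilerplate(raw: str) -> str:
    """Return only the text between the START and END marker lines.

    If a marker is missing, that side of the text is left untouched.
    """
    start = START_RE.search(raw)
    end = END_RE.search(raw)
    begin = start.end() if start else 0
    finish = end.start() if end else len(raw)
    return raw[begin:finish].strip()


def clean_file(src_path: str, dst_path: str = "leipoldt.txt") -> None:
    """Read a raw download, strip the boilerplate and save the result."""
    with open(src_path, encoding="utf-8") as f:
        cleaned = strip_gutenberg_boilerplate(f.read())
    with open(dst_path, "w", encoding="utf-8") as f:
        f.write(cleaned + "\n")
```

Calling clean_file("raw_download.txt") (a hypothetical file name) would write the trimmed text to leipoldt.txt.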

Library

You have a few options. 

If you want it to run on your computer (even without a graphics processor – I built some on a MacBook Air), you can use GPT2. If you want something more user friendly and in the cloud, consider the paid service InferKit.

GPT2-simple is a simplified library that allows you to use the GPT2 model (note that GPT3 has recently been released). GPT2 is a text-generating AI system released by OpenAI. It is basically a language model, trained on large amounts of text from the internet, that can write (mostly) coherent text.

Here is a famous example of a prompt (written by a person) and the response (written by the AI):

Note that you will have to install TensorFlow (not the latest version) to run GPT2 or GPT2-simple. Just follow the instructions in the GitHub doc (see the bottom of this document for more step-by-step instructions).

Download the models and run it!

You can learn more about the GPT2 models and sizes here. I used the second-largest model, 774M (around 3 GB in size).
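If you go the GPT2-simple route, training on leipoldt.txt could look roughly like the sketch below. It assumes the gpt-2-simple package (imported as gpt_2_simple) and a compatible TensorFlow are installed. The run name, step counts and sampling settings are illustrative guesses, not values from this post, and it uses the small 124M model instead of 774M to keep things light on a machine without a GPU.

```python
import os


def finetune_settings(dataset="leipoldt.txt", model_name="124M", steps=500):
    """Collect the keyword arguments passed to gpt2.finetune below."""
    return {
        "dataset": dataset,        # the cleaned training file
        "model_name": model_name,  # assumption: small model for a laptop
        "steps": steps,            # illustrative value, tune to taste
        "run_name": "leipoldt",    # hypothetical checkpoint name
        "sample_every": 100,       # print a sample every 100 steps
        "save_every": 250,         # write a checkpoint every 250 steps
    }


def train_and_sample(settings, prefix=None):
    """Fine-tune on the dataset, then return a list of generated texts."""
    # Imported here so the file can be loaded without TensorFlow installed.
    import gpt_2_simple as gpt2

    # Download the base model once; it is stored under ./models by default.
    if not os.path.isdir(os.path.join("models", settings["model_name"])):
        gpt2.download_gpt2(model_name=settings["model_name"])

    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, **settings)
    return gpt2.generate(
        sess,
        run_name=settings["run_name"],
        prefix=prefix,          # optional opening words for the poem
        length=200,             # number of tokens to generate
        temperature=0.8,        # higher values give more surprising text
        return_as_list=True,
    )
```

Running train_and_sample(finetune_settings()) would start training and hand back one generated sample. Expect it to be slow on a CPU.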

The result

I don’t have any of my locally generated poetry examples, but I took a screen recording of something I ran on InferKit. Click the play button below to see it (or watch it directly on Twitter).

If you don’t speak Afrikaans: it’s somewhat gibberish, but only somewhat so. You can mostly make sense of the output (it’s not complete garbage, even though it might have grammatical errors and inconsistencies) and it produced completely new lines (so not a direct copy-and-paste of the training data).
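The "not copy-and-paste" claim is easy to check mechanically. The sketch below is one simple way to do it, not the method used for this post: it compares each generated line against the lines of the training file, ignoring case and extra spaces, and reports which ones appear verbatim. The function names are invented for the example.

```python
def normalise(line: str) -> str:
    """Lower-case a line and collapse runs of whitespace."""
    return " ".join(line.lower().split())


def novelty_report(generated: str, training: str) -> dict:
    """Split generated lines into those copied from training and new ones."""
    train_lines = {normalise(l) for l in training.splitlines() if l.strip()}
    gen_lines = [normalise(l) for l in generated.splitlines() if l.strip()]
    return {
        "copied": [l for l in gen_lines if l in train_lines],
        "new": [l for l in gen_lines if l not in train_lines],
    }


# Placeholder lines, not real poetry:
training_text = "Eerste reël\nTweede reël\n"
generated_text = "eerste   reël\nDerde reël\n"
report = novelty_report(generated_text, training_text)
print(len(report["copied"]), "copied,", len(report["new"]), "new")
```

An exact-line match is a blunt test. It would miss a line that was copied with one word changed, so treat a clean result as a hint rather than proof.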

So, for my purposes, a success! 🎉

More fun examples

I fed a locally hosted system an input by one of my favourite authors, Douglas Adams.

Me:

For instance, on the planet Earth, man had always assumed that he was more intelligent than dolphins because he had achieved so much—the wheel, New York, wars and so on—whilst all the dolphins had ever done was muck about in the water having a good time. But conversely, the dolphins had always believed that they were far more intelligent than man—for precisely the same reasons.

AI:

In fact, this may well be true: dolphins are far more advanced than humans at every level of intelligence. But, in fact, they were much more advanced than men.

In an age where more and more people are interested in a certain set of ideas, we often see people arguing that certain types of intelligence have been given to men as gifts—for example, that we can learn from a certain kind of man to become one of the many humanoids. When the scientific consensus is to make man a “humanoid” and then turn it into a “humanoid,” it is difficult to get there without saying that it is difficult to know how to explain or demonstrate a certain kind of intelligence.

Some people who say that we need to understand the nature of intelligence would say that human beings are very far from an advanced species of creature.

Here’s a screen grab I took of a slightly more terrifying example (showing the potential of AI to generate believable fake news).

(The “Model prompt” is what I wrote, the “Sample 1” is the AI-generated text).

This is part of the reason why OpenAI initially said that GPT2 might be “too dangerous to release” (I’m pretty sure building up hype was another reason). 

Installation tips

  1. Clone the Git repository and ‘cd’ into the directory

git clone https://github.com/openai/gpt-2.git && cd gpt-2

  2. Follow the instructions for native installation.
  3. Install TensorFlow 1.12 (not the latest version). Use the GPU version if your system has a GPU, as it will be much faster (mine doesn’t, but it still works).

pip3 install tensorflow==1.12.0

or, if your system has a GPU:

pip3 install tensorflow-gpu==1.12.0

  4. Install all the other Python packages:

pip3 install -r requirements.txt

  5. Download the model. I used the 774M model; alternatively, you can use 124M, 355M or 1558M:

python3 download_model.py 774M

  6. Run it! As the warning says: samples are unfiltered and may contain offensive content. Samples can be conditional or unconditional.

To generate unconditional samples from the small model (the script’s default; add --model_name 774M to use the model downloaded above):

python3 src/generate_unconditional_samples.py | tee /tmp/samples

To give the model custom prompts, you can use:

python3 src/interactive_conditional_samples.py --top_k 40
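If you are wondering what --top_k 40 does: at each step the model scores every possible next token, and top-k sampling keeps only the k highest-scoring ones before picking one at random. The sketch below illustrates the idea in plain Python. It is not OpenAI's code, and the function name and toy scores are made up.

```python
import math
import random


def top_k_sample(logits, k=40, rng=random):
    """Sample an index from the k highest-scoring entries of logits.

    k <= 0 is treated as "no restriction", mirroring the GPT2 scripts,
    where 0 means every token stays in play.
    """
    if k <= 0 or k > len(logits):
        k = len(logits)
    # Indices of the k largest scores.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


# Toy scores for a three-token vocabulary:
scores = [0.1, 2.0, -1.0]
print(top_k_sample(scores, k=2))  # index 0 or 1; index 2 is never chosen
```

A smaller k makes the output safer but more repetitive, while a larger k lets in stranger word choices.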

Thanks for reading!

Let me know if you have any poetic, funny or scary examples of your own to share.

About the author

Werner van Rooyen

Formerly Business Development and Marketing at Luno (where we went from eight nerds in a tiny office to hundreds of people spread over three continents) and before that Marketing at PayFast. Currently investing, paragliding, and doing research, mostly in Mexico.
