# Generating Death Metal Band names with Recurrent Neural Networks

Anyone familiar with death metal knows it's a genre defined by its extremes. The music, the names, the subject matter: it's all extreme! I thought it would be interesting to see what a recurrent neural network would come up with when asked to generate its own death metal band names.

The real reason I did this was to code up an LSTM with multiplicative integration (MI-LSTM) and see how well it did at sequence modeling compared to a vanilla LSTM.

But first, the fun part! A selection of the generated band names is below. A word of caution: due to the nature of death metal, some readers might find the generated names offensive.

```
Sadistic Stench
Joyless Homicide
Enthrallment Suns
Torment of Torment
Psychologist
Old Scorched
Words of Death
Suburbia Ghost of Pain
Gore Disease
Fragile Riot
Wormhorn
Utopia Corpse
Psychic Inc.
Burn Me Binder
Rest in Death
Bloodgasm
Jonas of Man
The Antisocial Ransatary
Youth Crushing
Ancient Torment
Human Terror Through Darkness
You
Morbid Defilement
Parasitical Mandality
Terrorable Life
Zombification
Human Stench
Gate of Here
Unknown Grinder
Inverted Brain
Zombie Strength
Hate Slug
Just Scream
```

## LSTM with multiplicative integration in PyTorch

Beyond the fun of it, I wanted to see how easy it would be to implement this type of LSTM in PyTorch and how its performance compared to a vanilla LSTM for sequence modeling. I'm happy to report that coding it in PyTorch was a breeze! However, I didn't see any real improvement over a vanilla LSTM. Most likely this problem wasn't complex enough to really test the capacity of the MI-LSTM.

### Coding the MI-LSTM

In PyTorch we define a forward function that explicitly controls the computation of our network. It specifies the operations that take us from the input tensor to the output tensor; in our case, these are all of the operations that make up an MI-LSTM.

The gate equations come from Appendix A of "On Multiplicative Integration with Recurrent Neural Networks".
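Written out for the forget gate (the input, output, and block-input gates have the same form, and any bias terms are folded into the linear maps, matching the code below):

$$
\begin{aligned}
f_t &= \sigma\left(\alpha_f \odot W_{fx}x_t \odot W_{fh}h_{t-1} + \beta_{f1} \odot W_{fx}x_t + \beta_{f2} \odot W_{fh}h_{t-1}\right)\\
c_t &= f_t \odot c_{t-1} + i_t \odot z_t\\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

where $\odot$ denotes element-wise multiplication.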

To turn these mathematical expressions into a PyTorch model we just have to set up the matrices and specify the computations. The resulting forward function looks nearly identical to the mathematical notation.

```
def forward(self, inp, h_0, c_0):
    # encode the input characters
    inp = self.encoder(inp)
    # forget gate
    f_g = F.sigmoid(self.alpha_f * self.weight_fx(inp) * self.weight_fh(h_0) +
                    (self.beta_f1 * self.weight_fx(inp)) + (self.beta_f2 * self.weight_fh(h_0)))
    # input gate
    i_g = F.sigmoid(self.alpha_i * self.weight_ix(inp) * self.weight_ih(h_0) +
                    (self.beta_i1 * self.weight_ix(inp)) + (self.beta_i2 * self.weight_ih(h_0)))
    # output gate
    o_g = F.sigmoid(self.alpha_o * self.weight_ox(inp) * self.weight_oh(h_0) +
                    (self.beta_o1 * self.weight_ox(inp)) + (self.beta_o2 * self.weight_oh(h_0)))
    # block input
    z_t = F.tanh(self.alpha_z * self.weight_zx(inp) * self.weight_zh(h_0) +
                 (self.beta_z1 * self.weight_zx(inp)) + (self.beta_z2 * self.weight_zh(h_0)))
    # current cell state
    cx = f_g * c_0 + i_g * z_t
    # hidden state
    hx = o_g * F.tanh(cx)
    # decode the hidden state into character logits
    out = self.decoder(hx.view(1, -1))
    return out, hx, cx
```
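The forward pass above assumes the projection layers and the alpha/beta scaling vectors already exist on the module. Here's a minimal sketch of how they might be set up in `__init__`; the layer names match the forward function, but the embedding encoder, the bias handling, and the all-ones initialization are my assumptions rather than the original code.

```
import torch
import torch.nn as nn

class MILSTM(nn.Module):
    def __init__(self, n_chars, hidden_size):
        super(MILSTM, self).__init__()
        self.encoder = nn.Embedding(n_chars, hidden_size)  # character embedding
        self.decoder = nn.Linear(hidden_size, n_chars)     # hidden state -> character logits
        # one input->hidden and one hidden->hidden projection per gate (f, i, o, z),
        # plus the multiplicative-integration scaling vectors alpha and beta
        for g in ('f', 'i', 'o', 'z'):
            setattr(self, f'weight_{g}x', nn.Linear(hidden_size, hidden_size))
            setattr(self, f'weight_{g}h', nn.Linear(hidden_size, hidden_size, bias=False))
            setattr(self, f'alpha_{g}', nn.Parameter(torch.ones(hidden_size)))
            setattr(self, f'beta_{g}1', nn.Parameter(torch.ones(hidden_size)))
            setattr(self, f'beta_{g}2', nn.Parameter(torch.ones(hidden_size)))
```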

### MI-LSTM vs LSTM

This quick experiment is based on the char-rnn from Andrej Karpathy's blog post "The Unreasonable Effectiveness of Recurrent Neural Networks". Most of the PyTorch code other than the MI-LSTM was borrowed and adapted from the examples and tutorials created by Sean Robertson in his repo "Practical PyTorch".

I took the list of death metal band names from the Kaggle dataset, then trained an MI-LSTM and an LSTM, each with 1024 hidden units, on the corpus of band names. The networks were trained with Adam at an initial learning rate of 0.0005 for 3 epochs, with the learning rate reduced by 75% at the start of epochs 2 and 3.
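For reference, here's a rough sketch of that training setup. The `band_names` list, the `char_to_ix` mapping, and the per-name loop are assumptions about the surrounding char-rnn code, but the optimizer and schedule follow the description above.

```
import torch

hidden_size = 1024
model = MILSTM(n_chars=128, hidden_size=hidden_size)
optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(3):
    # cut the learning rate by 75% at the start of epochs 2 and 3
    if epoch > 0:
        for group in optimizer.param_groups:
            group['lr'] *= 0.25
    for name in band_names:  # one band name per training example
        hidden = torch.zeros(1, hidden_size)
        cell = torch.zeros(1, hidden_size)
        optimizer.zero_grad()
        loss = 0
        # predict each next character from the previous one
        for i in range(len(name) - 1):
            inp = torch.tensor([char_to_ix[name[i]]])
            target = torch.tensor([char_to_ix[name[i + 1]]])
            out, hidden, cell = model(inp, hidden, cell)
            loss += criterion(out, target)
        loss.backward()
        optimizer.step()
```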

[Figure: training loss curves for LSTM and MI-LSTM.]

Both LSTM variants converged to similar loss values; this task may simply have been too easy to show the benefits of multiplicative integration.
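For the curious, the names at the top of the post come from sampling the trained network one character at a time. A minimal sketch of that loop is below; the temperature value, the newline end-of-name convention, and the `char_to_ix`/`ix_to_char` mappings are assumptions.

```
def sample(model, start_char, max_len=30, temperature=0.8):
    # feed the network its own predictions one character at a time
    name = start_char
    hidden = torch.zeros(1, 1024)
    cell = torch.zeros(1, 1024)
    inp = torch.tensor([char_to_ix[start_char]])
    for _ in range(max_len):
        out, hidden, cell = model(inp, hidden, cell)
        # temperature-scaled multinomial sampling keeps the names varied
        probs = torch.softmax(out.squeeze() / temperature, dim=0)
        ix = torch.multinomial(probs, 1).item()
        if ix_to_char[ix] == '\n':  # treat newline as end-of-name
            break
        name += ix_to_char[ix]
        inp = torch.tensor([ix])
    return name
```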

The full code is available here.

### References

- Yuhuai Wu, Saizheng Zhang, Ying Zhang, Yoshua Bengio, Ruslan Salakhutdinov, "On Multiplicative Integration with Recurrent Neural Networks", link
- Andrej Karpathy, “The Unreasonable Effectiveness of Recurrent Neural Networks”, link
- Diederik Kingma, Jimmy Ba, “Adam: A Method for Stochastic Optimization”, link
- Sean Robertson, “Practical PyTorch”, link