Analysis and Generation of Rap Lyrics

Date: October 2018 - May 2019
Role: Solo Project
Description: This project explores techniques used to artificially generate rap lyrics and presents a new technique for lyric generation centred around iterative improvements upon random text, akin to a hill-climbing algorithm. 21 metrics were implemented in a Python tool for songwriters to analyse their rap lyrics, including a new algorithm for rhyme detection. Random text is generated via a Markov chain-based approach using lyrics taken from Genius and classified as rap lyrics with 96% accuracy. The metrics are then used to suggest improvements to the lyrics, bringing each metric closer to the means found in existing lyrics. The resulting software is able to generate lyrics with a higher rhyme density than existing Markov chain and word substitution lyric generation models.

What was the result of your work?
Overall, lyrics generated by my new method appeared more similar to existing rap lyrics than those from the other three methods studied in the dissertation (as judged by the 'rap likeness' classifier created for the project). On the rhyme factor feature, the system outperformed 15 of the 98 professional rappers used in the study, whereas two of the three existing systems did not beat any of these rappers. The dissertation also received a mark of 80% at the end of the year.
What was the motivation for this project?
During the third year of university we all have to do a large solo project. I wanted to combine one of my interests or hobbies with an exciting area of Computer Science, as I believe this is possible for most interests if you take the right approach. While looking for inspiration I came across the work of Eric Malmi and his algorithm for detecting rhymes. From there I got lots of ideas for natural language processing, and ultimately came up with my system for generating rap lyrics. This is how "Analysis and Generation of Rap Lyrics" was born.
How does the generation system work?
The system begins with initial random lyrics (specifically, a single line generated by a Markov chain trained on existing lyrics). From there the system analyses the lyrics using several features and selects the feature whose improvement will make the lyrics appear most like existing lyrics (as determined by the output of a random forest classifier). The system then improves that feature in the lyrics. This analysis-improvement cycle is repeated until the lyrics are satisfactory (the likeness threshold is reached) or enough iterations have passed.
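As a rough illustration of this cycle, here is a minimal Python sketch. It is not the project's code: the function names (`build_chain`, `generate_line`, `refine`, `toy_likeness`, `make_proposer`) are hypothetical, the toy likeness score stands in for the random forest classifier, and a random word-level edit stands in for the targeted feature improvements. It only shows the overall shape: a Markov chain seed line, then repeated changes that are kept when the likeness score goes up.

```python
import random
from collections import defaultdict

def build_chain(lines):
    """Map each word to the list of words that follow it in the training lines."""
    chain = defaultdict(list)
    for line in lines:
        words = line.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain

def generate_line(chain, rng, max_words=8):
    """Random walk through the chain to produce one seed line."""
    word = rng.choice(sorted(chain))
    out = [word]
    while len(out) < max_words and chain.get(word):
        word = rng.choice(chain[word])
        out.append(word)
    return " ".join(out)

def toy_likeness(lines, target_words=6):
    """Assumed stand-in for the classifier: closeness of mean line length to a target."""
    mean_words = sum(len(line.split()) for line in lines) / len(lines)
    return 1.0 / (1.0 + abs(mean_words - target_words))

def make_proposer(chain, rng):
    """Assumed stand-in for a feature improvement: add or drop a word on one line."""
    vocabulary = sorted(chain)
    def propose(lines):
        new_lines = list(lines)
        i = rng.randrange(len(new_lines))
        words = new_lines[i].split()
        if rng.random() < 0.5 and len(words) > 1:
            words.pop()
        else:
            words.append(rng.choice(vocabulary))
        new_lines[i] = " ".join(words)
        return new_lines
    return propose

def refine(lyrics, likeness, propose, threshold=0.9, max_iterations=100):
    """Repeat analysis-improvement until the threshold is met or iterations run out."""
    score = likeness(lyrics)
    for _ in range(max_iterations):
        if score >= threshold:
            break
        candidate = propose(lyrics)
        candidate_score = likeness(candidate)
        if candidate_score > score:  # hill climbing: keep only improvements
            lyrics, score = candidate, candidate_score
    return lyrics, score

# Usage with a tiny made-up corpus (real training data would be a lyrics dataset).
corpus = ["keep the rhythm moving on", "moving on the beat tonight", "the beat keeps moving"]
rng = random.Random(0)
chain = build_chain(corpus)
seed = [generate_line(chain, rng)]
final_lyrics, final_score = refine(seed, toy_likeness, make_proposer(chain, rng))
```

Because a candidate is accepted only when the score rises, the final score can never be lower than the seed's score, which is the hill-climbing property described above.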
Are there any disadvantages to using your system for generating lyrics?
In terms of rhyming, this is not the best existing system, but it makes up for this by structuring the lyrics similarly to existing human-written lyrics. A potentially big disadvantage of the system is that it adapts to the structure (and takes the vocabulary) of the lyrics that it is trained on. So if the quality of the training lyrics is low, then the final output will likely be low quality too. Similarly, if only 80s rap is used to train the system, then it will produce lyrics similar to 80s rap. This can be beneficial if the user wants to produce lyrics of a certain style, but in general it is important to train the system on a wide variety of lyrics.
What is the most important thing you learnt from this project?
I did not struggle with managing the project overall, but there was a specific period where I fell behind on work and needed to adapt the plan to get back on track. Although I had put some mitigations in place to help, I still learnt a lot about juggling tasks. Not everything will be completed in the planned time, particularly when you also need to learn new tools or domain-specific knowledge. In a project like this, extra planning or research near the beginning will save more time during development.
What metrics can the analysis system use to quantify rap lyrics?
There is a large variety of metrics, including:
  • Rhyme density
  • Profanity count
  • Line syllable similarity
  • Reading age
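To give a feel for how metrics like these can be computed, here is a minimal Python sketch of two of them. The definitions are assumptions made for illustration, not the ones used in the project: syllables are estimated by counting vowel groups (a crude heuristic), line syllable similarity is taken as the average min/max syllable ratio of adjacent lines, and the profanity list is supplied by the caller. All function names are hypothetical.

```python
import re

WORD_PATTERN = re.compile(r"[a-z']+")

def count_syllables(word):
    """Estimate syllables as the number of vowel groups (rough heuristic, at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def line_syllables(line):
    """Total estimated syllables across all words in a line."""
    return sum(count_syllables(word) for word in WORD_PATTERN.findall(line.lower()))

def line_syllable_similarity(lines):
    """Average min/max syllable ratio of adjacent lines; 1.0 means equal lengths."""
    counts = [line_syllables(line) for line in lines]
    if len(counts) < 2:
        return 1.0
    ratios = []
    for a, b in zip(counts, counts[1:]):
        longest = max(a, b)
        ratios.append(1.0 if longest == 0 else min(a, b) / longest)
    return sum(ratios) / len(ratios)

def profanity_count(lines, profanities):
    """Count words that appear in a caller-supplied set of lowercase profanities."""
    return sum(
        1
        for line in lines
        for word in WORD_PATTERN.findall(line.lower())
        if word in profanities
    )

# Usage on a made-up couplet.
couplet = ["I got the flow", "You know I go"]
similarity = line_syllable_similarity(couplet)
```

A vowel-group count misjudges some words, so a pronunciation dictionary would give more reliable syllable counts if accuracy matters.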