PRiSM SampleRNN

What is PRiSM SampleRNN?

PRiSM SampleRNN is a project centred on the development of prism-samplernn, an open-access neural audio synthesis tool released on GitHub in June 2020 as part of PRiSM Future Music #2. Development of the software is funded by the Research England fund Expanding Excellence in England (E3).

The software is PRiSM’s first major contribution to the field of machine learning. Built upon the SampleRNN model, which addresses unconditional audio generation in the raw acoustic domain, it is able to generate new audio outputs by ‘learning’ the characteristics of any existing corpus of sound or music. Artists are invited to collate their own datasets (or rather, sound libraries), curated for their unique creative contexts. Changing how these datasets are organised, or adjusting the parameters of the algorithm during the generation process, significantly changes the resultant output, which makes these choices part of the creative process. The audio generated can be used directly in a composition or to inform notated work to be played by an instrumentalist.

The software was developed in response to work by Dr Sam Salem (PRiSM Senior Lecturer in Composition). For his piece Midlands (2019), Salem made field recordings whilst walking 120 km of the River Derwent. These materials were used to synthesise new sounds with WaveNet, one of the earlier deep-learning algorithms for audio generation, but the slow workflow made it difficult to explore the full possibilities of the technique (documented in the PRiSM blog post A Psychogeography of Latent Space). An alternative, SampleRNN, offered an opportunity to generate sound more quickly, but the available code was broken. Dr Christopher Melen, PRiSM Research Software Engineer (2019–2023), undertook a complete reimplementation of the original SampleRNN code¹, resolving the broken dependencies and making it work with current versions of Python and TensorFlow. The result is a completely new implementation of the SampleRNN architecture (documented in the PRiSM blog post A Short History of Neural Synthesis).

The release of PRiSM SampleRNN was accompanied by a model developed by Salem using data from the RNCM’s world-class archive of choral music. Since then, it has developed into one of PRiSM’s major projects, bringing together practitioners across a diverse range of disciplines and fields of study, and illustrating PRiSM’s core research concerns of collaborative and interdisciplinary effort. It is currently being used in projects by composers, musicians and technologists across the globe. The software is free and open source, and the latest release can be downloaded from GitHub. It is compatible with most mainstream computing platforms (including devices with Apple silicon chips), and the technique has been explained through a number of online resources and performances that demonstrate this creative application of AI, informing technology researchers, other arts practitioners, educators and the general public.

Notes

1 The original SampleRNN architecture was described in the paper SampleRNN: An Unconditional End-to-End Neural Audio Generation Model (2017). The original implementation was based on Python 2 and Theano, a library for performing fast computations involving matrices, and formed the basis for the Dadabots’ famous Relentless Doppelganger livestream on YouTube (https://www.youtube.com/watch?v=MwtVkPKx3RA). Since both Python 2 and Theano are officially deprecated, it was decided that PRiSM would offer its own implementation, based on Google’s popular machine learning library TensorFlow 2.

Resources

prism-samplernn code on GitHub

prism-samplernn Google Colab Notebook

Citation

prism-samplernn is open source and free to use under an MIT licence, with copyright retained by RNCM PRiSM. We would ask that the software and funding are referenced in work as follows:

prism-samplernn
Led by Sam Salem and Christopher Melen
Initiated by Sam Salem
prism-samplernn code by Christopher Melen

A PRiSM Collaboration also involving David De Roure, Marcus du Sautoy, and Emily Howard.

The RNCM Centre for Practice & Research in Science & Music (PRiSM) is funded by the Research England fund Expanding Excellence in England (E3).

Relevant Projects and PRiSM Blogs