PRiSM SampleRNN

What is PRiSM SampleRNN?

PRiSM SampleRNN is a project centred on the development of prism-samplernn, an open-access neural audio synthesis software tool released on GitHub in June 2020 as part of PRiSM Future Music #2. Development of the software is funded by the Research England fund Expanding Excellence in England (E3).

The software is PRiSM’s first major contribution to the field of Machine Learning. Built upon the SampleRNN model, which addresses unconditional audio generation in the raw acoustic domain, it is able to generate new audio outputs by ‘learning’ the characteristics of any existing corpus of sound or music. Artists are invited to collate their own datasets (or rather, sound libraries), curated for their unique creative contexts. Changing how these datasets are organised, as well as the parameters of the algorithm during the generation process, significantly changes the resultant output, making these choices part of the creative process. The audio generated can be used directly in a composition or to inform notated work to be played by an instrumentalist.

The software was developed in response to work by Dr Sam Salem (PRiSM Senior Lecturer in Composition). For his piece Midlands (2019), Salem made field recordings whilst walking 120km of the River Derwent. These materials were used to synthesise new sounds with WaveNet, one of the earlier deep-learning algorithms for audio generation, but the slow speed of the workflow made it difficult to explore the full possibilities of the technique (documented in the PRiSM blog post A Psychogeography of Latent Space). An alternative, SampleRNN, represented an opportunity to generate sound more quickly, but the existing code was broken. Dr Christopher Melen, PRiSM Research Software Engineer (2019-2023), undertook a complete reimplementation of the original SampleRNN code¹, resolving its broken dependencies and making it work with the latest versions of Python and TensorFlow. The result is a completely new implementation of the SampleRNN architecture (documented in the PRiSM blog post A Short History of Neural Synthesis).

The release of PRiSM SampleRNN was accompanied by a model developed by Salem using data from the RNCM’s world-class archive of choral music. Since then, it has developed into one of PRiSM’s major projects, bringing together practitioners across a diverse range of disciplines and fields of study, and illustrating PRiSM’s core research concerns of collaborative and interdisciplinary effort. It is currently being used in projects by composers, musicians and technologists across the globe. The project is free and open source, and the latest release can be downloaded from the software development platform GitHub. The software is compatible with most mainstream computing systems (including devices with Apple silicon chips), and the technique has been made explicit through a number of online resources and performances demonstrating this creative application of AI, informing technology researchers, other arts practitioners, educators and the general public.

Notes

1 The original SampleRNN architecture was described in the paper SampleRNN: An Unconditional End-to-End Neural Audio Generation Model (2017). The original implementation was based on Python 2 and Theano, a library for performing fast computations involving matrices. It formed the basis for the Dadabots’ famous Relentless Doppelganger livestream on YouTube (https://www.youtube.com/watch?v=MwtVkPKx3RA). Since both Python 2 and Theano are officially deprecated, it was decided that PRiSM would offer its own implementation, based on Google’s popular Machine Learning library TensorFlow 2.

Resources

prism-samplernn code on GitHub

prism-samplernn Google Colab Notebook

Citation

prism-samplernn is open source and free to use under an MIT licence, with copyright retained by RNCM PRiSM. We would ask that the software and funding are referenced in work as follows:

prism-samplernn
Led by Sam Salem and Christopher Melen
Initiated by Sam Salem
prism-samplernn code by Christopher Melen

A PRiSM Collaboration also involving David De Roure, Marcus du Sautoy, and Emily Howard.

This work is supported by PRiSM, the Centre for Practice & Research in Science & Music at the Royal Northern College of Music, funded by the Research England fund Expanding Excellence in England (E3).

PRiSM SampleRNN Case Study

Co-led by PRiSM artist-researchers Dr Bofan Ma and Dr Ellen Sargen, the Case Study looked into the stories behind the development and roll-out of PRiSM SampleRNN since its launch in 2020. The study was finalised in 2024 and focussed upon a selection of creative projects undertaken by artists closely involved, at varying stages, in the iterative design and refinement of the resultant software tool. Specifically, the team sought to understand the methods these artists chose for engaging with PRiSM SampleRNN (contextualised by their varying pre-existing knowledge of programming and neural synthesis), their interaction with the Research Software Engineer, and the impact this had on their creative process. Taking place in a conservatoire, this project additionally sought to understand how new models of shared learning could engender new creative and teaching strategies in the age of machine learning and AI.

A summary report of the study was published at the International Conference on AI and Musical Creativity in September 2024. The paper offers a highly positional view of collaborative dialogues between artists and computer science specialists, inviting practitioners, technologists, and institutions to collectively prioritise human-centred interdisciplinary approaches to AI research as we progress further into a technologically mediated musical future.

Open access report: Learning to Learn – A Reflexive Case Study of PRiSM SampleRNN