PRiSM JavaScript App Tutorial – Part II

2 September 2021

Dr Christopher Melen (PRiSM Software Research Engineer)

Welcome back

Welcome to the second in PRiSM’s series of tutorials on audio application development in modern JavaScript. In the first part we set up our development environment, and introduced the main technological components of the project, such as React and Parcel. In this part we will be looking at how to generate audio using JavaScript, specifically using the Web Audio API, as well as updating the app’s user interface with components for controlling audio output. This will involve an in-depth look at user interaction in React, and more generally at React application state management.

Generating Audio in JavaScript

Our application already has a graphical UI, albeit a simplistic one with nothing to do with audio. We’re building an app that will render audio waveforms to the page, but we also want to hear the audio itself, not simply draw the signal. So before we add any more sophisticated rendering code we need to build the ‘engine’ responsible for generating the audio signal itself. We don’t need to move outside JavaScript for this, however, as modern browsers come with a built-in JavaScript API for generating and manipulating audio: the Web Audio API. This is a cross-platform API implemented by every major web browser (though not fully by Safari, which we do not recommend for running the code in these tutorials). This means, conveniently, that we only have to write the code once and it will run in both Firefox and Chrome with little or no modification. For example, the code for generating a 440Hz sine wave with the Web Audio API consists of only a few lines:

// Create Web Audio API context
var audioCtx = new (window.AudioContext || window.webkitAudioContext)();

// Create Oscillator node
var oscillator = audioCtx.createOscillator();

oscillator.type = 'sine';
oscillator.frequency.setValueAtTime(440, audioCtx.currentTime);
oscillator.connect(audioCtx.destination);
oscillator.start();

A trivial change to the above code allows us to generate a different waveform, simply by changing the oscillator type:

oscillator.type = 'square';

We can use the API directly, but in order to make our code a bit more user-friendly we’ll create a separate oscillator ‘engine’ which abstracts away the raw API, whilst exposing the functionality we require. We will define our engine using the factory function pattern, as follows:

function Oscillator(type='sine', freq=440, gain=0.015) {

  // Entry point for the Web Audio API (webkit prefix for older browsers)
  const audioCtx = new (window.AudioContext || window.webkitAudioContext)();

  // Oscillator node generates the periodic waveform
  const osc = audioCtx.createOscillator();
  osc.type = type;
  osc.frequency.value = freq;

  // Gain node controls output volume; analyser node exposes the waveform data
  const masterGain = audioCtx.createGain();
  const analyser = audioCtx.createAnalyser();

  // Routing: oscillator -> analyser -> gain
  // (the gain node is only connected to the speakers in start(), below)
  osc.connect(analyser);
  analyser.connect(masterGain);

  masterGain.gain.value = gain;

  osc.start();

  // Buffer holding the current block of time-domain waveform data
  let waveform = new Float32Array(analyser.frequencyBinCount);
  analyser.getFloatTimeDomainData(waveform);

  return {
    
    start: () => {
      masterGain.connect(audioCtx.destination);
    },

    stop: () => {
      masterGain.disconnect();
    },

    setFreq: (freq) => {
      osc.frequency.value = freq;
    },

    setOscType: (type) => {
      osc.type = type;
    },

    
    getWaveform: () => {
      analyser.getFloatTimeDomainData(waveform);
      return waveform;
    }

  };

}

export let osc = Oscillator();

The above code gives us the object which will act as the engine of our application, capable of producing different waveforms through a simple set of configuration options. Obviously we want to be able to generate not just a single sound, but many different types of waveforms, at different frequencies and timbres. This is what our little engine gives us, by exporting a single object which encapsulates the functionality we need from the Web Audio API.

At the heart of the engine is the AudioContext object, which provides the entry point for the entire API:

const audioCtx = new (window.AudioContext || window.webkitAudioContext)();

The Web Audio API was first implemented in WebKit, the browser engine used by Safari and older versions of Chrome, so the above provides webkitAudioContext as a fallback when the standard AudioContext is not available. An application built with the Web Audio API consists of a graph of audio nodes, and we call methods on the AudioContext object to create the different types of node we need. The graph created by our engine consists of instances of three node types:

  • An OscillatorNode instance, for generating a periodic waveform (such as a sine wave).
  • An AnalyserNode instance, for processing real-time frequency and time-domain information.
  • A GainNode instance, for controlling the signal volume.

After chaining these nodes together we call the oscillator’s start() method to begin generating the signal at a specified time (undefined by default, meaning start immediately). This does not make the oscillator audible, however. For that we need to connect our gain node to the AudioContext’s destination property, which represents the output device (the speakers):

masterGain.connect(audioCtx.destination);

The code at this point, however, is just creating and initialising the internals of the engine. Connecting the oscillator to the destination will be left as a job for the user – indeed the above line of code is actually part of the engine’s start() method, which will be triggered only by a user interaction (as might be expected there is a complementary stop() method, which disconnects the oscillator from the destination). We will come back to the reasons for doing things this way in the next section, on user interaction.

The engine also exposes a few additional methods which wrap other API methods, for example setFreq(), which allows the frequency of the generated signal to be modified, and getWaveform(), which provides access to the Float32Array buffer storing the waveform data in 32-bit floating point format. We will eventually make use of this data to render a waveform. Note that instead of exporting the Oscillator() factory function we export a variable osc storing an Oscillator() instance. This makes it easy for us to access the audio engine’s methods at different levels of the app’s UI hierarchy, without having to pass the object down through the tree of components, which can be cumbersome (a ‘production’ app might instead make use of React’s Context API, which provides a way to make components at lower levels in the tree aware of higher-level data without having to pass it down through props).
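To make the shape of this little API concrete, here is a minimal sketch (not part of the app itself) of how another module might drive the engine directly – the button id is hypothetical, and our actual wiring through React follows in the next section:

// Sketch only: driving the engine outside React (the element id is hypothetical)
import { osc } from "./oscillator";

osc.setOscType('sawtooth');   // switch the waveform type
osc.setFreq(880);             // up an octave from the default 440Hz

document.querySelector('#demo-button').addEventListener('click', () => {
  osc.start();                          // connect the gain node to the speakers
  console.log(osc.getWaveform()[0]);    // peek at the first sample of the current frame
});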

User Interaction

It is generally not good manners to play audio automatically when a site loads – in fact Google Chrome’s autoplay policy will block it until the user has interacted with the page. Starting the oscillator should therefore be the responsibility of the user, so that is what we will implement (this is why our oscillator engine does not immediately connect to the destination output). Recall from the previous part in our tutorial series that React refreshes the DOM in response to changes in application state. We can also leverage these changes to cause side-effects, for example to call methods on external objects and APIs. Calling methods on our audio engine falls into exactly this category (we will see in the next part how we can also use side-effects to render things outside the normal flow of React component updates). Interactions by the user obviously need to cause these changes to occur somehow, and React has a simple, elegant model for managing this, involving event handlers and callback functions. In order for the user to trigger any kind of event we will need a set of controls, which we can implement using standard HTML elements. At minimum we will require a control for switching the oscillator on and off, which we can implement with a standard HTML button element. But in a ‘real’ app of this sort a user might also expect to be able to modify properties of the waveform, such as its frequency, amplitude, or the type of waveform (sine, square, sawtooth, etc.), so we will also require controls for each of these interactions. For example, to change the oscillator type we could provide the user with a dropdown list of options, a very familiar design element in interactive applications. HTML provides a dropdown control for just this purpose, in the select element. Choosing a different option in this list will trigger an update to the application state, which in turn will result in an update to the rendered waveform.

Defining a component that renders a dropdown select in React is easy:

const OscillatorSelect = () => {
  return (
    <select>
      &lt;option value="sine">Sine</option>
      <option value="sawtooth">Sawtooth</option>
      <option value="square">Square</option>
      <option value="triangle">Triangle</option>
    </select>
  )
}

Now open src/app.js once again and replace the previous ‘Hello, World!’ code with the following, which defines the App component itself (there will be slight modifications to this code in the next tutorial, but it gives us what we need for now):

 
import React, { useState, useLayoutEffect, useRef } from "react";
import { osc } from "./oscillator";

// See https://www.robinwieruch.de/react-useeffect-only-on-update
const useLayoutEffectOnlyOnUpdate = (callback, dependencies) => {

  const didMount = React.useRef(false);
 
  React.useLayoutEffect(() => {
    if (didMount.current) {
      callback(dependencies);
    } else {
     didMount.current = true;
    }
  }, [callback, dependencies]);
};

const App = () => {
  const [oscType, setOscType] = useState('sine');
  const [freq, setFreq] = useState(440);
  const [hasAudio, setHasAudio] = useState(false);
  const handleChangeOscType = (e) => {
    setOscType(e.target.value);
  }
  const handleChangeFreq = (e) => {
    setFreq(e.target.value);
  }
  const handleHasAudio = () => {
    setHasAudio(!hasAudio);
  }

  useLayoutEffect(() => {
    osc.setOscType(oscType);
  }, [oscType]);

  useLayoutEffect(() => {
    osc.setFreq(freq);
  }, [freq]);

  useLayoutEffectOnlyOnUpdate(() => {
    hasAudio ? osc.start() : osc.stop();
  }, [hasAudio]);

  return (
    <div>
      <div id="toolbar">
        <label htmlFor="osc">Oscillator Type:</label>
        <select id="osc" value={ oscType } onChange={handleChangeOscType}>
          <option value="sine">Sine</option>
          <option value="sawtooth">Sawtooth</option>
          <option value="square">Square</option>
          <option value="triangle">Triangle</option>
        </select>
        <label htmlFor="freq">Frequency:</label>
        <input id="freq" type="number" value={freq} min={350} max={1050} onChange={handleChangeFreq}></input>
        <button id="start-stop" onClick={handleHasAudio}>{ hasAudio ? 'STOP' : 'START' }</button>
      </div>
    </div>
  )
}

export default App;

The App component’s JSX defines a set of HTML control elements, including a dropdown select for changing the oscillator type. In addition we have a number input and a button element. These again are standard HTML elements, implemented by all browsers (with small variations in visual appearance). The number input allows upper and lower bounds to be set for the frequency range, and can be adjusted either with individual mouse clicks or continuously using the keyboard up and down arrows. The button element requires little explanation – just about every app has one for some purpose, and in ours it allows the user to start or stop the audio output.

The controls are placed inside a container div element, which acts as a simple toolbar. We will also need to add some CSS to define the layout for the controls. It is possible to add CSS directly inside JSX, through a style prop, but we will go the old-fashioned route and use external CSS, contained in a file loaded in the HTML head element. So open public/style.css and add the following lines:

canvas {
    display: block;
    margin: 15px;
}

#toolbar {
    width: 800px;
    margin-left: 15px;
    margin-right: 15px;
}

#freq, #osc {
    margin-left: 5px;
}

#osc {
    margin-right: 10px;
}

#start-stop {
    float: right;
}
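As an aside, the inline alternative mentioned above would pass a JavaScript object (with camelCased property names) to the style prop rather than a CSS string. Purely as an illustration – we stick with the external stylesheet – the toolbar rule might look something like this inside the component’s JSX:

<div id="toolbar" style={{ width: '800px', marginLeft: '15px', marginRight: '15px' }}>
  {/* toolbar controls go here */}
</div>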

Managing Application State

We now have the basic UI structure of our app in place. But how do React control components detect interactions? And how are these interactions propagated through the rest of the application? In plain HTML a select element typically lives within an enclosing form, and its data is managed by the DOM itself and eventually submitted to a backend. In React, updates are instead managed by the component, using inline event handlers such as onChange, onClick, and so on. In order to communicate these events to the rest of the application, and to persist changes in such a way that the application can respond to them consistently over time, we need a way of managing state. We briefly examined React’s state management model in Part I, in the context of the class component API. Although class components are not deprecated, React strongly encourages writing functional components wherever possible. But how do we accommodate the notion of ‘state’ when writing functional components? For this we need to look at the React Hooks API.

In class components state is managed using the inbuilt state object. Class components are instances, carrying their own mutable state around with them. Until hooks were introduced there was no parallel model of state for functional components. The Hooks API provides such a model via the useState() hook, and functional components can use this to manage state in an equivalent way. useState() returns a pair of values – the current state, and a function which can be used to update it:

const [freq, setFreq] = useState(440);

The state can only be accessed using this variable, and can only be modified using the function. Like their class equivalents, updates to state variables in functional components trigger the component to re-render. Using this simple but powerful API functional components can access and update state in a way entirely equivalent to the earlier class-based API.
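If this pattern is new to you, a minimal (purely illustrative) component makes it clear: each call to the setter replaces the stored value and schedules a re-render, so the next render sees the new value:

// Minimal illustration of useState – not part of our app
import React, { useState } from "react";

const Counter = () => {
  const [count, setCount] = useState(0);   // initial value is 0
  return (
    <button onClick={() => setCount(count + 1)}>
      Clicked {count} times
    </button>
  );
};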

In our App component code we have a group of three useState() entries:

const [oscType, setOscType] = useState('sine');
const [freq, setFreq] = useState(440);
const [hasAudio, setHasAudio] = useState(false);

Here we have one state variable oscType for the oscillator type, another variable freq for frequency, and a final, Boolean variable hasAudio which will determine whether we have audio output or not. Any value passed to useState() will be set as the initial value of the variable (above we are setting the initial oscillator type to be a sine wave, with its initial frequency set to 440, but with audio output switched off). Each variable is also automatically paired with a function which may be used to modify its value. Whatever we pass to this modifier function will set the variable to that new value (we can call this function anything we like, but by convention we use the set prefix, since it is a ‘setter’).

After creating and initialising our state variables, immediately below we define our event handlers, which are the functions that will fire in direct response to actions performed by the user:

const handleChangeOscType = (e) => {
  setOscType(e.target.value);
}

const handleChangeFreq = (e) => {
  setFreq(e.target.value);
}

Passing a function as the value of a component’s onChange prop means it will fire in response to the change event in the DOM. React is agnostic about what we put inside such a function, but clearly if its purpose is to handle events, the code inside it is probably going to modify something. In fact both these handlers simply call the modifier function paired with one of the state variables. So in response to an action performed by the user – say, selecting a different oscillator type from the dropdown list – a change event is triggered, which in turn fires the handleChangeOscType() handler, which calls setOscType() in order to modify the oscType state variable.

The above example nicely encapsulates what React is all about: a cascade of component renders and DOM updates triggered by state changes (whether using class or functional components, the basic pattern is the same). But how do we connect these changes to our audio engine? The solution lies in another hook, useLayoutEffect(). This hook takes two arguments – a callback function, and an array of variables (or ‘dependencies’) which, when updated, fire the callback (if this array is empty, however, the callback is fired only once, after the initial render of the component). Since the callback will fire in response to updates in state variables, we can use it to run side-effects, after the main render (the same can be achieved in React class components through component lifecycle methods such as componentDidMount() and/or componentDidUpdate()). Here we employ useLayoutEffect() to call methods on our audio engine, in order to start or stop it, change the oscillator type, or modify its frequency. Note how each useLayoutEffect() entry responds to a specific state variable, included in the array passed as the second argument. For example, when the value of the freq state variable changes this triggers the engine’s setFreq() method to fire, modifying the frequency of the oscillator.
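As a quick illustration of how the dependency array governs when the callback runs, compare the following two (hypothetical) calls inside a component body:

// Runs once only, after the initial render
useLayoutEffect(() => {
  console.log('mounted');
}, []);

// Runs after the initial render, and again whenever freq changes
useLayoutEffect(() => {
  console.log('frequency is now', freq);
}, [freq]);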

Notice that our hasAudio variable is paired with a slight variant of useLayoutEffect(), called useLayoutEffectOnlyOnUpdate(). This is not a built-in React hook but a custom hook (code courtesy of Robin Wieruch). Custom hooks provide a way for client code to extend or adapt the behaviour of React’s built-in hooks. For example, we’ve learnt that the useLayoutEffect() hook fires its callback whenever any of its dependency variables is updated. React treats the initial setting of a state variable as an update in its own right, which means a useLayoutEffect() callback will also fire when the component first loads. But what if we want the callback to fire only when a ‘real’ update occurs, triggered by a user interaction? This is exactly our requirement – we need the callback to fire only after the user has clicked the audio start/stop button. That is what the useLayoutEffectOnlyOnUpdate() hook achieves (the name is rather cumbersome – in practice we’d want something less verbose!). Internally it wraps useLayoutEffect(), but pairs it with a boolean didMount flag whose value records whether the component has already mounted. The flag is held using the useRef() hook, which we’ll describe in more detail in the next tutorial – very briefly, this hook provides a way to manage references to mutable data and objects (and even DOM elements) within a functional component. It’s somewhat similar to useState() in the sense that it provides access to persistent data, although unlike useState() it does not trigger a re-render when its value changes. This makes it very useful for more general forms of application ‘state’, as in our useLayoutEffectOnlyOnUpdate() hook, where it acts somewhat like an instance variable.
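To give a flavour of the difference, here is a small hypothetical example: the ref below persists across renders just like state, but updating it does not trigger one:

// Hypothetical illustration of useRef – the count survives re-renders,
// but mutating it does not cause the component to re-render
import React, { useRef } from "react";

const SilentCounter = () => {
  const clicks = useRef(0);
  return (
    <button onClick={() => { clicks.current += 1; console.log(clicks.current); }}>
      Clicked (check the console – the label only updates if something else re-renders)
    </button>
  );
};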

In the next tutorial we will be covering a more involved state management scenario, in which we use side-effects to render waveforms. We already have an example of how rendering components can depend on state changes, however, in the code for the start/stop button. This nicely illustrates a powerful feature of React called conditional rendering:

<button id="start-stop" onClick={handleHasAudio}>{ hasAudio ? 'STOP' : 'START' }</button>

We can write a component in this way because React allows arbitrary JavaScript expressions to be embedded inside JSX. Notice that in place of its inner HTML, which would usually just contain text for the button’s label or some additional HTML, we have a piece of JavaScript logic (a ternary conditional expression) enclosed in curly braces – the braces are necessary whenever we want to include JavaScript inside JSX. If our hasAudio state variable is true we render the text ‘STOP’; if not, we render the opposite instruction, ‘START’.
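The same idea extends to rendering whole elements conditionally. Purely as an illustration (not part of our app), we could show a status message only while audio is playing by using the logical && operator inside JSX:

{/* Renders the span when hasAudio is true, and nothing otherwise */}
{ hasAudio && <span>Audio is playing…</span> }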

We have now reached the end of the second part in this series of tutorials on building an audio app with modern JavaScript. Thank you for following along this far. In this part we looked in some detail at the Web Audio API, and built our application’s audio engine using the API’s modular routing model. We followed this by looking at how to control and modify signals coming from the engine using standard HTML user interface elements, built using React functional components and the React Hooks API. In the third and final part we will look at how to take the audio signal and render it in a clean, efficient manner as a waveform, once again leveraging React’s functional component model and hooks.
