Listen closely now…

One thing I’ve learned working on the Pithesiser is how bad a naive implementation of digital audio can sound.

You’d think that “CD quality” 44.1kHz stereo 16-bit digital audio reproduction would be a good base level to start from. Well, it is, but there’s a whole host of little details to trip you up once you get going. And the more you crack on, the more sensitive your ear gets to the little niggling clicks and strange harmonics that can arise.

I’ve started to focus on sorting out some of these audio glitches with the Pithesiser, which is proving to be a great learning experience – and an ongoing one, as it will probably take a while to get to the bottom of them all. And now that I have functioning examples of all the key synth components, there’s a host of “moving parts” that can mess with the sound unexpectedly.

First up, clipping. For the uninitiated, that’s when the amplitude of the digital signal exceeds the range the sample format can represent (16 bits here, so -32768 to 32767 for signed samples), and the tops and bottoms of waveforms get cut off flat. This creates nasty distortion. In most digital audio applications, clipping arises when you mix multiple sounds together and the sum of some (or indeed all) parts of their waveforms goes out of range. By carefully managing the relative levels of the sounds, you can avoid going out of range and clipping – but that reduces the overall volume and means each sound uses less of the available precision. This does limit the quality you can achieve, but when lots of sounds are playing it’s not so noticeable, as there is a lot going on audibly. Clipping can also arise when applying post-processing effects such as filtering (which can boost frequencies as well as cut them); but again, careful “gain management” can sort this out for you.
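To make the trade-off concrete, here is a minimal Python sketch (not the Pithesiser's actual code) that mixes three full-scale sine waves in two ways: a plain sum clamped to the 16-bit range, and a sum where each voice is first scaled by 1/N. The function names, the chosen frequencies and the 1/N rule are all assumptions made for illustration.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767

def sine(freq_hz, n_samples, sample_rate=44100, amplitude=INT16_MAX):
    """Return n_samples of a sine wave as integers (full scale by default)."""
    return [round(amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate))
            for i in range(n_samples)]

def mix_clipped(voices):
    """Sum the voices sample by sample, clamping the result to the 16-bit range."""
    return [max(INT16_MIN, min(INT16_MAX, sum(s))) for s in zip(*voices)]

def mix_scaled(voices):
    """Scale the sum by 1/N so the result can never leave the 16-bit range."""
    n = len(voices)
    return [round(sum(s) / n) for s in zip(*voices)]

# Three hypothetical voices, 10 ms each at 44.1kHz.
voices = [sine(220, 441), sine(330, 441), sine(440, 441)]
clipped = mix_clipped(voices)
scaled = mix_scaled(voices)

# Count the samples that were flattened against the limits.
n_flat = sum(1 for s in clipped if s in (INT16_MIN, INT16_MAX))
print("samples flattened by clipping:", n_flat)
print("peak of scaled mix:", max(abs(s) for s in scaled))
```

The scaled mix never clips, but each voice now only uses a third of the available range, which is the loss of precision described above.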

Next, clicking. Sudden changes in the waveform often introduce clicks or pops, both when increasing and when decreasing the level. On the Pithesiser I’ve found clicks arising in various ways – for example, a sound with an instantaneous attack or release (one that starts or stops suddenly) can very obviously generate a click. I also have an automated volume adjustment when mixing multiple notes, based on the number of notes currently playing (to help avoid clipping), but this can cause sudden volume changes as notes start and stop, and these often click. One approach to managing clicks is to smooth the changes, either by interpolating the change of volume over a short period of time, or by applying a low pass filter to remove the high frequencies that the sudden change introduces.
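The interpolation approach can be sketched in a few lines of Python. This is only an illustration under assumed names and values (`apply_gain_ramp`, a 441-sample ramp, which is 10 ms at 44.1kHz), not the Pithesiser's implementation: the gain moves linearly from its old value to its new one instead of jumping.

```python
def apply_gain_ramp(samples, start_gain, end_gain, ramp_len):
    """Scale samples, moving linearly from start_gain to end_gain
    over the first ramp_len samples, then holding end_gain."""
    out = []
    for i, s in enumerate(samples):
        if i < ramp_len:
            gain = start_gain + (end_gain - start_gain) * i / ramp_len
        else:
            gain = end_gain
        out.append(s * gain)
    return out

def max_step(samples):
    """Largest jump between two consecutive samples."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))

# A constant signal makes the effect of the gain change easy to see.
signal = [1.0] * 1000
abrupt = apply_gain_ramp(signal, 1.0, 0.5, 1)    # gain halves within one sample
smooth = apply_gain_ramp(signal, 1.0, 0.5, 441)  # gain halves over about 10 ms

print("largest step, abrupt:", max_step(abrupt))
print("largest step, smooth:", max_step(smooth))
```

The abrupt version contains a single large step, which is the kind of discontinuity that is heard as a click; the ramped version spreads the same change over many tiny steps.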

Finally, harmonics. Quite often you can hear strange quiet overtones on certain waveforms at certain pitches – the waveform doesn’t sound “pure” even if you’re playing just a simple sine wave. These are harmonics, and can arise from various sources. The inherent “stepped” or discrete nature of digital audio means that the wave is never truly smooth like it would be in analogue form. The stepped changes in amplitude can create these harmonics, particularly if the steps are large – even if you have generated your waveform from the purest, simplest form of the math behind it. Playing back a sampled waveform at a higher pitch than its original can make this harmonic distortion worse, as the steps between successive values get bigger. I’ve also been hearing a strange harmonic when a sound has a fast attack or release on its volume envelope.

Unwanted harmonics are the hardest of the three to tackle. Again, smoothing out the sudden changes helps – you can apply a low pass filter, use a longer attack or release, or calculate interpolated values when playing at a higher pitch. If the waveform is sampled, you can generate a “band limited” version of it mathematically (effectively pre-filtering the waveform). These all work, but they do have an audible effect on the sound – as well as the undesired harmonics, some of the desirable higher frequencies are lost too.
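One common way to build a band limited waveform is additive synthesis: sum only the sine harmonics that fit under half the sample rate at the intended playback pitch. The Python sketch below does this for a sawtooth; the choice of waveform, the function names and the table length are assumptions for illustration, and other band limiting techniques exist.

```python
import math

def highest_harmonic(freq_hz, sample_rate=44100):
    """Largest harmonic number whose frequency does not exceed half the sample rate."""
    return int((sample_rate / 2) // freq_hz)

def band_limited_saw(table_len, max_harmonic):
    """One cycle of a sawtooth built from sine harmonics 1..max_harmonic."""
    table = []
    for i in range(table_len):
        phase = 2 * math.pi * i / table_len
        # Fourier series of a sawtooth: sin(k*phase)/k with alternating sign.
        value = sum((-1) ** (k + 1) * math.sin(k * phase) / k
                    for k in range(1, max_harmonic + 1))
        table.append(2 / math.pi * value)
    return table

# A table intended for playback at 440Hz keeps 50 harmonics;
# one intended for 4400Hz only has room for 5.
print(highest_harmonic(440), highest_harmonic(4400))
table = band_limited_saw(2048, highest_harmonic(440))
```

The fewer harmonics a table keeps, the duller it sounds, which is the loss of desirable high frequencies mentioned above.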

Increasing accuracy also helps: you can use a higher sample rate, which means less dramatic changes in value from sample to sample, or generate a sampled waveform at multiple pitches and select the one closest to the playback pitch. When interpolating between two sampled waveform values, using an interpolator that is mathematically “close” to the underlying waveform shape will also help.
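As a rough illustration of why interpolation matters, the Python sketch below reads a small sine table at fractional positions in two ways: taking the nearest earlier sample, and interpolating linearly between the two neighbouring samples. Linear interpolation is only the simplest choice of interpolator; the 64-entry table, the step size and the function names are assumptions picked to make the difference visible.

```python
import math

def read_truncated(table, position):
    """Read a single-cycle table using only the sample at or before position."""
    return table[int(math.floor(position)) % len(table)]

def read_interpolated(table, position):
    """Read a single-cycle table at a fractional position using linear interpolation."""
    base = math.floor(position)
    i = int(base) % len(table)
    frac = position - base
    nxt = table[(i + 1) % len(table)]  # wrap around at the end of the cycle
    return table[i] + frac * (nxt - table[i])

TABLE_LEN = 64
table = [math.sin(2 * math.pi * i / TABLE_LEN) for i in range(TABLE_LEN)]

# Step through the table at a non-integer rate and compare with the exact sine.
err_trunc = 0.0
err_interp = 0.0
for n in range(500):
    pos = n * 0.37
    exact = math.sin(2 * math.pi * pos / TABLE_LEN)
    err_trunc = max(err_trunc, abs(read_truncated(table, pos) - exact))
    err_interp = max(err_interp, abs(read_interpolated(table, pos) - exact))

print("worst error, truncated:   ", err_trunc)
print("worst error, interpolated:", err_interp)
```

A straight line is a reasonable fit for a sine between two closely spaced samples, so the error shrinks considerably; a waveform with sharper features would be fitted less well, which is why the shape of the interpolator matters.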

The journey to good quality digital audio is long but ultimately rewarding. And you thought all those terms used on CD players in the past such as “8x oversampling” and bit counts higher than 16 were just marketing guff…

(c) 2013 Nicholas Tuckett