On Raspberry Pi performance

Warning: somewhat technical post – look away now if you don’t know what bit twiddling means (that’s a technical term by the way)…

Because of the tight timing constraints required of a real-time audio application, code performance is highly important. It has been a while since I did any profiling of the Pithesiser, so it’s about time to experiment on it a bit. New features such as filters, configurable controllers and high-resolution MIDI support do eat into the available performance budget after all.

My theme for this round of profiling was “float vs fixed point”. As I originally plumped for fixed point math code in the Pithesiser, I thought I should do something to justify that choice rather than going on gut feel. So I made some floating point implementations of the filtering, procedural sine oscillator and wavetable oscillator components and set up some timing tests around repeating these chunks of code numerous times.
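For illustration, a timing test of this kind can be sketched in C roughly as below. This is a minimal sketch, not the Pithesiser's actual test code: the names `bench_fn`, `time_iterations` and `sample_workload` are hypothetical, and it uses the standard `clock()` call, which measures CPU time rather than wall-clock time.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical signature for a chunk of code under test. */
typedef void (*bench_fn)(void *context);

/* Stand-in workload: bumps the counter that context points to. */
void sample_workload(void *context)
{
    int *counter = (int *)context;
    (*counter)++;
}

/* Calls fn the requested number of times and returns the CPU time
 * consumed, in seconds, as reported by clock(). */
double time_iterations(bench_fn fn, void *context, int iterations)
{
    assert(fn != NULL && iterations > 0);

    clock_t start = clock();
    for (int i = 0; i < iterations; i++) {
        fn(context);
    }
    clock_t end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Running the float and fixed point versions through the same harness with the same iteration count gives figures that can at least be compared with each other.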

It started out interesting, then got a little unexpected – which all adds up to the fact that one should always adhere to the golden rule of performance coding – “time everything, assume nothing”.

Float is faster than fixed point…

With my filter implementation, I achieved a three times speedup with the float version. Yup, scarily big – I wasn’t expecting quite that leap. So I took a detailed look at the code generated by the compiler for the fixed point version – it was massive, way larger than the floating point alternative. Lots of register setup and manipulation on each iteration of the sample processing loop, really dragging down performance.

I did some digging around in ARM documentation, considering whether I could hand-roll my own assembler version – but ultimately decided that I didn’t want to spend the time needed to achieve that just yet. Because of the relatively high precision I was using (18 bit fractional part), a lot of the fixed point math had to be done using 64-bit values to avoid overflow – and that’s what I think was behind the gnarly amount of code. The Pi’s ARM is 32 bit, so has to use multiple instructions to achieve the 64 bit math (apart from multiplication).
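To show why the wider intermediate is needed, here is a minimal sketch of a fixed point multiply with an 18 bit fractional part. The type and function names (`fixed_t`, `fixed_mul`, `fixed_from_double`) are made up for this example and are not taken from the Pithesiser source.

```c
#include <assert.h>
#include <stdint.h>

#define FRACTION_BITS 18                  /* precision mentioned above */
#define FIXED_ONE (1 << FRACTION_BITS)    /* 1.0 in this format */

typedef int32_t fixed_t;

/* Converts a double to fixed point, truncating toward zero. */
fixed_t fixed_from_double(double value)
{
    return (fixed_t)(value * FIXED_ONE);
}

/* Multiplies two fixed point values. The raw product carries 36
 * fractional bits, so it is held in 64 bits before shifting back down.
 * Right-shifting a negative value is implementation-defined in C;
 * GCC performs an arithmetic shift, which is what this sketch assumes. */
fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return (fixed_t)(product >> FRACTION_BITS);
}
```

Multiplying 8.0 by 8.0 in this format gives a raw product of 2^42, far beyond what a 32-bit register can hold, even though the final result fits comfortably.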

Float continues to be faster than fixed point…

Then I moved on to the oscillator code, setting up float versions of the procedural sine and wavetable driven oscillators. Timing tests of these also showed the float procedural version to be faster, but not the wavetable version. The culprit turned out to be the conversion of the phase parameter into an integer index into the wavetable; when I replaced that part with a fixed point approach the float-based wavetable implementation was also faster. Looking good for floats so far – a further average gain of 15% to 20%.
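One common way to keep the phase in fixed point while the samples stay in float is a 32-bit phase accumulator, sketched below. This is an assumption about the general technique rather than the exact Pithesiser code; the table size and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_BITS 10                          /* hypothetical 1024-entry table */
#define TABLE_SIZE (1u << TABLE_BITS)
#define PHASE_FRACTION_BITS (32 - TABLE_BITS)

/* Phase advance per sample, where the full 32-bit range is one cycle. */
uint32_t phase_step(double frequency_hz, double sample_rate_hz)
{
    assert(sample_rate_hz > 0.0);
    assert(frequency_hz >= 0.0 && frequency_hz < sample_rate_hz);
    return (uint32_t)((frequency_hz / sample_rate_hz) * 4294967296.0);
}

/* The top TABLE_BITS bits of the phase are the table index:
 * a single shift, with no float-to-int conversion involved. */
uint32_t table_index(uint32_t phase)
{
    return phase >> PHASE_FRACTION_BITS;
}

/* Fills out with count samples read from a float wavetable. The phase
 * wraps around for free when the unsigned addition overflows. */
void render_wavetable(const float *table, uint32_t *phase, uint32_t step,
                      float *out, int count)
{
    assert(table != 0 && phase != 0 && out != 0 && count >= 0);
    for (int i = 0; i < count; i++) {
        out[i] = table[table_index(*phase)];
        *phase += step;
    }
}
```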

Another upside of this float-based code is that it looks simpler and is easier to read than the equivalent fixed point code. Double win!

Doh! Sound hardware doesn’t do floating point.

Then I hit a wall – the USB sound card only does signed 16-bit integer samples. So if I switch the Pithesiser to floating point, there will have to be a final conversion stage – and judging by the float-to-int conversion issue from the wavetable, that could be a significant hit on performance. I whipped up a simple “scale-&-cast” float-to-int audio buffer conversion and timed it – lo and behold, it was expensive. About as expensive as my fixed point filter implementation. Swings and roundabouts, frying pans and fires.
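A “scale-&-cast” conversion of this sort looks roughly like the sketch below. It assumes float samples in the range -1.0 to 1.0; the function name is hypothetical, and the clamping is an addition of mine to keep out-of-range input from wrapping, not something the description above mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts float samples (nominally -1.0 to 1.0) to signed 16-bit. */
void float_to_int16(const float *in, int16_t *out, size_t count)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        float scaled = in[i] * 32767.0f;          /* scale */
        if (scaled > 32767.0f) scaled = 32767.0f; /* clamp (added here) */
        if (scaled < -32768.0f) scaled = -32768.0f;
        out[i] = (int16_t)scaled;                 /* cast, truncating toward zero */
    }
}
```

The cast on the last line is the float-to-int conversion that has to happen once per sample.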

The news gets worse for floats (in a way).

Next I considered optimising the float-to-int conversion. The generated code seemed ok and there were no function calls involved (as there can be on x86), so no easy wins there – it would have to be a bit-twiddling approach. But to ensure it worked, I would need to generate a float waveform to check the end results – so I took the float oscillator code from the earlier test and made that into its own module so I could reuse it. To be on the safe side, I ran the performance tests again with the new module… but now the float implementation ran slower than the fixed point! Up to 55% slower.

It turned out that when the float oscillator code was in the test module, it was being inlined into the test loop rather than being called. And I was running a very large number of iterations, so the cost of doing the call (including caching penalties) was murdering the speed. The fixed point oscillator code was in the original Pithesiser oscillator module, so that wasn’t being inlined. That meant the original test wasn’t really fair.
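One way to keep this kind of comparison fair, assuming GCC or a compatible compiler, is to mark both implementations with the `noinline` attribute so that neither gets folded into the test loop. The two sawtooth oscillators below are hypothetical stand-ins, not the Pithesiser oscillators.

```c
#include <assert.h>
#include <stdint.h>

/* GCC-style attribute; expands to nothing on other compilers. */
#if defined(__GNUC__)
#define NO_INLINE __attribute__((noinline))
#else
#define NO_INLINE
#endif

/* Float sawtooth: phase runs from 0.0 to 1.0, output from -1.0 to 1.0. */
NO_INLINE float next_sample_float(float *phase, float step)
{
    assert(phase != 0);
    float value = 2.0f * (*phase) - 1.0f;
    *phase += step;
    if (*phase >= 1.0f) {
        *phase -= 1.0f;
    }
    return value;
}

/* Fixed point sawtooth: 32-bit phase, output in the signed 16-bit range. */
NO_INLINE int32_t next_sample_fixed(uint32_t *phase, uint32_t step)
{
    assert(phase != 0);
    int32_t value = (int32_t)(*phase >> 16) - 32768;
    *phase += step;
    return value;
}
```

With both functions forced to be real calls, each version pays the same call overhead on every iteration.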

So I levelled the playing field, dialled down the number of iterations and ran the oscillator tests again. This time there wasn’t much in it – the fixed point oscillators were on the whole a little faster.

The moral of the story

There’s that golden profiling rule of “time everything, assume nothing” – and for me, this experience bears that out. To which I would add that “context is everything” – you can create an artificial benchmark to prove that approach A is faster than approach B, but reality may ultimately make a nonsense of your results.

Getting hard figures that I could use to plan my next steps really helped. As it would appear that there wasn’t much to choose between float or fixed for oscillators, it came down to whether I could accept the float-to-int conversion hit and have faster float filters, or avoid the conversion but have slower fixed point filters.

But maybe I don’t have to! With some tweaking and tuning, I got the fixed point filters to be about 10% quicker than the floating point versions by dialling down the precision to 14 bits and consequently avoiding a lot of the 64-bit math without hurting quality.
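As a rough illustration of why fewer fractional bits can sidestep 64-bit math: a 16-bit sample multiplied by a 14 bit fractional coefficient smaller than 2.0 in magnitude always fits in 32 bits. The sketch below rests on that assumption; the names are hypothetical and the real filter code may be arranged quite differently.

```c
#include <assert.h>
#include <stdint.h>

#define Q14_BITS 14
#define Q14_ONE (1 << Q14_BITS)   /* 1.0 with a 14 bit fractional part */

/* Converts a double to the 14 bit format, truncating toward zero. */
int32_t q14_from_double(double value)
{
    return (int32_t)(value * Q14_ONE);
}

/* Scales a 16-bit sample by a coefficient with magnitude below 2.0.
 * Worst case is 2^15 * 2^15 = 2^30, so the product stays within a
 * signed 32-bit value and no 64-bit intermediate is needed.
 * Assumes an arithmetic right shift for negative values (as on GCC). */
int32_t q14_scale_sample(int16_t sample, int32_t coefficient)
{
    assert(coefficient > -(2 * Q14_ONE) && coefficient < 2 * Q14_ONE);
    return ((int32_t)sample * coefficient) >> Q14_BITS;
}
```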

Consequently I’m going to go forward for now with fixed point, reasonably confident that there’s enough performance there to increase the workload of the synth without getting into trouble… yet.

(c) 2013 Nicholas Tuckett

These go to sixteen thousand, three hundred and eighty three…

The Pithesiser just got that little bit shinier – I improved the audio quality in three areas with a small purchase and just two changes to the code…

A New Toy

[Image: Behringer BCR2000 MIDI controller]

That’s a Behringer BCR2000 programmable MIDI controller, which is taking over from the Korg nanoKONTROL. Using it means I can have a lot more controls readily accessible, and I can take advantage of high precision MIDI controllers. The Korg is limited to MIDI controllers that only go from 0 to 127 – the BCR2000 controls can be configured to go from 0 up to 16,383.

How does that improve audio quality? By allowing much finer control where needed. For example, take the master volume for the synth. With only 128 distinct volume settings, you can hear the stepped nature of the changes – often as clicks depending on the sound playing. Using a controller with a range of 0 to 4095 instead makes the steps fine enough that you can’t hear them and don’t get any clicks.
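In standard MIDI, a high-resolution controller value is sent as two 7-bit halves (a coarse byte and a fine byte) that combine into one 14-bit number, which is where the 16,383 limit comes from. The sketch below shows that arithmetic; the function names are hypothetical, and exactly how the BCR2000 is configured to send its values is not covered here.

```c
#include <assert.h>
#include <stdint.h>

/* Combines the coarse (MSB) and fine (LSB) 7-bit data bytes into a
 * single 14-bit controller value in the range 0 to 16383. */
uint16_t midi_combine_14bit(uint8_t msb, uint8_t lsb)
{
    assert(msb < 128 && lsb < 128);
    return (uint16_t)((msb << 7) | lsb);
}

/* Maps a controller value onto 0.0 to 1.0 for a configured maximum,
 * for example 4095 for the master volume described above. */
float controller_to_unit(uint16_t value, uint16_t max_value)
{
    assert(max_value > 0 && value <= max_value);
    return (float)value / (float)max_value;
}
```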

Having to use a new controller made me face the necessity of introducing a level of configurability into the Pithesiser – now all the controller parameters are set up in a config file read in when the synth is started up. Much easier to use!

I must admit to being a little concerned about the fact this is Behringer kit – past experience with some of their hardware has taught me that you definitely get what you pay for… and their stuff is relatively cheap. However I’m impressed by what this device delivers for its price – the bulk of the case is metal, the buttons and majority of the pots feel really solid. Only the top row of pots feel a bit wobbly; they’re also push buttons which might explain why. It also turns out Daft Punk have used them live (http://digitaldj.wordpress.com/2008/07/07/inside-the-pyramid-daft-punks-live-gear/) so they must be able to take some of the rough and tumble of live setups.

Smooth Operator

In an attempt to rid the Pithesiser of clicks from the automatic volume adjustment (a form of “auto-ducking”) code, I introduced smooth interpolation of volume changes across each chunk of processed audio. As I was doing this, it occurred to me that I could also utilise this to see if it would eliminate the strange harmonics arising from sounds with fast attack and/or release volume envelopes.

This all led to a simpler implementation than the original code, where the envelope and auto-ducking code work together to calculate what the volume should be at the end of a given audio chunk then pass that on to the actual synthesis code which smoothly interpolates the volume at the end of the last chunk towards the new value.
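The interpolation step can be sketched as a simple linear ramp across the chunk. This version uses floats for readability and hypothetical names; the Pithesiser itself works in fixed point, so treat it as an outline of the idea only.

```c
#include <assert.h>
#include <stddef.h>

/* Scales one chunk of samples while ramping the volume linearly from
 * start_volume (where the last chunk ended) to end_volume, which is
 * reached on the final sample of this chunk. */
void apply_volume_ramp(float *samples, size_t count,
                       float start_volume, float end_volume)
{
    assert(samples != NULL && count > 0);

    float step = (end_volume - start_volume) / (float)count;
    float volume = start_volume;
    for (size_t i = 0; i < count; i++) {
        volume += step;
        samples[i] *= volume;
    }
}
```

The caller keeps hold of `end_volume` and passes it in as `start_volume` for the next chunk, so the level never jumps at a chunk boundary.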

And guess what? It resolved both of the original audio quality issues!

Listen closely now…

One thing I’ve learned working on the Pithesiser is how bad a naive implementation of digital audio can sound.

You’d think that “CD quality” 44.1kHz stereo 16-bit digital audio reproduction would be a good base level to start from? Well it is, but there’s a whole host of little details to trip you up once you get going. And the more you crack on, the more sensitive your ear gets to the little niggling clicks and strange harmonics that can arise.

I’ve started to focus on sorting out some of these audio glitches with the Pithesiser, which is proving to be a great learning experience – and an ongoing one, as it will probably take a while to get to the bottom of them all. And now I have functioning examples of all the key synth components, there’s a host of “moving parts” that can mess with the sound unexpectedly.

First up, clipping. For the uninitiated, that’s when the amplitude of the digital signal exceeds the limits of the precision being used (16 bits here), and the tops and bottoms of waveforms get cut off flat. This creates nasty distortion. In most digital audio applications, clipping arises when you mix multiple sounds together and the sum of some (or indeed all) parts of their waveforms go out of range. By careful management of the relative levels of the sounds, you can avoid going out of range and clipping the sound – but that reduces the overall volume and means you make less use of the audio precision available for each sound. This does limit the quality you can achieve, but when lots of sounds are playing that’s not so noticeable as there is a lot audibly going on. Clipping can also arise when applying post-processing effects such as filtering (which can boost frequencies as well as cut them); but again careful “gain management” can sort this for you.
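To make the trade-off concrete, here is a small sketch of mixing two 16-bit voices, once with hard clipping and once with headroom. The function names are hypothetical and this is not the Pithesiser's mixer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Limits a 32-bit value to the signed 16-bit range. */
int16_t saturate_int16(int32_t value)
{
    if (value > 32767) return 32767;
    if (value < -32768) return -32768;
    return (int16_t)value;
}

/* Sums two voices at full level. Anything out of range is cut off
 * flat, which is the clipping distortion described above. */
void mix_clipped(const int16_t *a, const int16_t *b, int16_t *out, size_t count)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = saturate_int16((int32_t)a[i] + b[i]);
    }
}

/* Sums two voices at half level each. The result can never go out of
 * range, but each voice only uses half of the available precision. */
void mix_with_headroom(const int16_t *a, const int16_t *b, int16_t *out, size_t count)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = (int16_t)(((int32_t)a[i] + b[i]) / 2);
    }
}
```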

Next, clicking. Sudden changes in the waveform often introduce clicks or pops, both when increasing and when decreasing the level. On the Pithesiser I’ve found clicks arising in various ways – for example, if a sound has instantaneous attack or release (suddenly starts and stops) this very obviously can generate a click. I also have an automated volume adjustment when mixing multiple notes based on the number of notes currently playing (to help avoid clipping), but this can cause sudden volume changes as notes start and stop which often click. One approach to managing clicks is to make the changes smooth by interpolating the change of volume over a short period of time, or by applying a low pass filter to eliminate high frequencies – smoothing the volume changes by mathematically processing the waveform.
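As an example of the low pass approach, a one-pole smoother applied to the volume value (rather than to the audio itself) takes only a couple of lines. This is a generic sketch with hypothetical names, not code from the Pithesiser.

```c
#include <assert.h>

/* State for a one-pole low pass smoother. The coefficient lies between
 * 0.0 and 1.0; smaller values give slower, smoother changes. */
typedef struct {
    float value;
    float coefficient;
} smoother_t;

/* Moves the smoothed value a fraction of the way towards the target
 * and returns it. Call once per sample (or per control update). */
float smoother_next(smoother_t *s, float target)
{
    assert(s != 0 && s->coefficient > 0.0f && s->coefficient <= 1.0f);
    s->value += s->coefficient * (target - s->value);
    return s->value;
}
```

A sudden jump in the target volume then turns into a short curve instead of a step.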

Finally, harmonics. Quite often you can hear strange quiet overtones on certain waveforms at certain pitches – the waveform doesn’t sound “pure” even if you’re playing just a simple sine wave. These are harmonics, and can arise from various sources. The inherent “stepped” or discrete nature of digital audio means that the wave is never truly smooth like it would be in analogue form. The stepped changes in amplitude can create these harmonics, particularly if they are large steps – and even if you have generated your waveform using the purest most simple form of the math behind it. Playing back a sampled waveform at a higher pitch than its original can make this harmonic distortion worse, as the difference in steps gets bigger. I’ve also been hearing a strange harmonic when a sound has a fast attack or release on its volume envelope.

Unwanted harmonics are the hardest of the three to tackle. Again, smoothing out the sudden changes helps – you can apply a low pass filter, use a longer attack or release, or calculate interpolated values when playing at a higher pitch. If the waveform is sampled, you can generate a “band limited” version of the waveform mathematically (effectively pre-filtering the waveform). These all work, but do have an audible effect on the sound – as well as the undesired harmonics, some of the desirable higher frequencies also get lost.

Increasing accuracy also helps – a higher sample rate means less dramatic changes in value from sample to sample, and generating a sampled waveform at multiple pitches lets you select the one closest to the playback pitch. When interpolating between two sampled waveform values, using an interpolator mathematically “close” to the underlying waveform shape will also help.
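The simplest interpolator is a straight line between neighbouring table entries, sketched below with hypothetical names. Higher-order interpolators follow the same pattern but use more neighbouring samples.

```c
#include <assert.h>
#include <stddef.h>

/* Reads a single-cycle wavetable at a fractional position, blending
 * linearly between the two nearest entries. The position must lie in
 * the range 0 up to (but not including) table_size; the last entry
 * blends with the first, since the waveform is periodic. */
float table_lookup_linear(const float *table, size_t table_size, float position)
{
    assert(table != NULL && table_size > 0);
    assert(position >= 0.0f && position < (float)table_size);

    size_t index = (size_t)position;
    float fraction = position - (float)index;
    size_t next = (index + 1) % table_size;

    return table[index] + fraction * (table[next] - table[index]);
}
```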

The journey to good quality digital audio is long but ultimately rewarding. And you thought all those terms used on CD players in the past such as “8x oversampling” and bit counts higher than 16 were just marketing guff…

Opening the “Can of Filter worms”…

Well, wadda ya know? I got basic filtering to work!

There’s a lot of great information out on the web about digital signal processing, so you would think that might make it easy. However there’s almost too much; you need some basic prior knowledge to filter the information before you can filter your signals.

That said, most basic filters boil down to some fairly straightforward implementations involving no math more complex than trigonometry, exponentials and algebra – and that’s only needed for calculating the coefficients, the actual filtering operation on sample data is just multiplies and adds. But I did have to go and make my life a little more difficult by insisting on using fixed point, so have to contend with avoiding overflows and correcting precision here and there.
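To show what “just multiplies and adds” means in practice, here is a minimal sketch of a two-pole (biquad) filter step in its common direct form. It uses floats for readability and hypothetical names, and it assumes the coefficients have already been calculated and normalised; the fixed point version in the Pithesiser has the extra shifts and overflow handling mentioned above.

```c
#include <assert.h>

/* Coefficients and history for one biquad filter. The coefficients are
 * assumed to be normalised (already divided by a0). */
typedef struct {
    float b0, b1, b2;   /* feed-forward coefficients */
    float a1, a2;       /* feedback coefficients */
    float x1, x2;       /* previous two inputs */
    float y1, y2;       /* previous two outputs */
} biquad_t;

/* Filters one sample: five multiplies and four adds/subtracts. */
float biquad_process(biquad_t *f, float x)
{
    assert(f != 0);
    float y = f->b0 * x + f->b1 * f->x1 + f->b2 * f->x2
            - f->a1 * f->y1 - f->a2 * f->y2;

    /* Shift the history along by one sample. */
    f->x2 = f->x1;
    f->x1 = x;
    f->y2 = f->y1;
    f->y1 = y;
    return y;
}
```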

So, I have a two-pole filter that can be switched between low and high pass modes, with base frequency and Q controls. Thanks to the snappily named “Cookbook formulae for audio EQ biquad filter coefficients”, it should be easily extended to more filter types (see http://www.musicdsp.org/files/Audio-EQ-Cookbook.txt). But first there are some quality issues to address: clicks when changing filter parameters, and some instability leading to unpleasant distortion when the high-pass filter is set low and low frequencies are played. The built-in oscilloscope display has come in very useful!
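For reference, the cookbook's low and high pass coefficients are reproduced below as I understand them, with f_0 the base frequency, F_s the sample rate and Q the resonance control. Check them against the cookbook itself before relying on them.

```latex
\begin{aligned}
\omega_0 &= 2\pi \frac{f_0}{F_s}, \qquad \alpha = \frac{\sin\omega_0}{2Q} \\
\text{low pass:}\quad b_0 &= \frac{1-\cos\omega_0}{2},\quad b_1 = 1-\cos\omega_0,\quad b_2 = \frac{1-\cos\omega_0}{2} \\
\text{high pass:}\quad b_0 &= \frac{1+\cos\omega_0}{2},\quad b_1 = -(1+\cos\omega_0),\quad b_2 = \frac{1+\cos\omega_0}{2} \\
\text{both:}\quad a_0 &= 1+\alpha,\quad a_1 = -2\cos\omega_0,\quad a_2 = 1-\alpha \\
y[n] &= \frac{b_0}{a_0}x[n] + \frac{b_1}{a_0}x[n-1] + \frac{b_2}{a_0}x[n-2] - \frac{a_1}{a_0}y[n-1] - \frac{a_2}{a_0}y[n-2]
\end{aligned}
```

Only the b coefficients differ between the two modes, which is why switching between low and high pass is cheap.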

The filter is also being applied globally to the synth’s output, and is currently “all or nothing”. So there’s plenty more stuff to try, such as adding a gain control, driving filter parameters from envelopes or applying a filter for each voice.
