How to re-sample an audio signal

As I mentioned earlier, I would like the flexibility to use digital audio data that originates outside the PC performing the DSP, and such data will necessarily have a different sample clock from the DAC. Something has got to give!

If the input were analogue, you would simply sample it with an ADC locked to your DAC’s sample rate, and the source’s own sample rate wouldn’t matter to you. With a standard digital audio source (e.g. S/PDIF) you need to be able to do the same thing, but purely in software. The incoming sampled data points are notionally turned into a continuous waveform in memory by duplicating a DAC reconstruction filter using floating point maths. You can then sample that waveform wherever you want, at a rate locked to the DAC’s sample rate.

You still ‘eat’ the incoming data at the rate at which it comes in, but you vary the number of samples that you ‘decimate’ from it (very, very slightly).

The control algorithm for locking this re-sampling to the DAC’s sample rate is not completely trivial, because the PC’s only knowledge of the sample rates of the DAC and S/PDIF is via notifications that large chunks of data have arrived or left, with unknown amounts of jitter. It is only possible to establish an accurate measure of relative sample rates with a very long time constant average. In reality the program never actually calculates the sample rate at all, but merely maintains a constant-ish difference between the read and write pointer positions of a circular buffer. It relies on adequate latency, and on the two sample rates being reasonably stable by virtue of being derived from crystal oscillators. The corrections will, in practice, be tiny and/or occasional.

How is the interesting problem of re-sampling solved?

Well, it’s pretty new to me, so in order to experiment with it I have created a program that runs on a PC and does the following:

  1. Synthesises a test signal as an array of floating point values at a notional sample rate of 44.1 kHz. This can be a sine wave, or combination of different frequency sine waves.
  2. Plots the incoming waveform as time domain dots.
  3. Plots the waveform as it would appear when reconstructed with the sinc filter. This is a sanity check that the filter is doing approximately the right thing.
  4. Resamples the data at a different sample rate (can be specified with any arbitrary step size e.g. 0.9992 or 1.033 or whatever), using floating point maths. The method can be nearest-neighbour, linear interpolation, or sinc & linear interpolation.
  5. Plots the resampled waveform as time domain dots.
  6. Passes the result into an FFT (65536 points), windowing the data with a raised cosine window.
  7. Plots the resulting resampled spectrum in terms of frequency and amplitude in dB.

This is an ideal test bed for experimenting with different algorithms and getting a feel for how accurate they are.

Nearest-neighbour and linear interpolation are pretty self-explanatory methods; the sinc method is similar to that described here:

I haven’t completely reproduced (or necessarily understood) their method, but I was inspired by this image:


The sinc function is the ideal ‘brick wall’ low pass filter and is calculated as sin(x*PI)/(x*PI). In theory it extends from minus to plus infinity, but for practical use it is windowed so that it tapers to zero at plus or minus the desired width – which should be as wide as practical.

The filter can be set at a lower cutoff frequency than Nyquist by stretching it out horizontally, and this would be necessary to avoid aliasing if wishing to re-sample at an effectively slower sample rate.

If the kernel is slid along the incoming sample points and a point-by-point multiply and sum is performed, the result is the reconstructed waveform. What the above diagram shows is that the kernel can be in the form of discrete sampled points, calculated as the values they would be if the kernel was centred at any arbitrary point.

So resampling is very easy: simply synthesise a sinc kernel in the form of sampled points based on the non-integer position you want to reconstruct, and multiply-and-add all the points corresponding to it.

A complication is the necessity to shorten the filter to a practical length, which involves windowing the filter i.e. multiplying it by a smooth function that tapers to zero at the edges. I did previously mention the Lanczos kernel which apparently uses a widened copy of the central lobe of the sinc function as the window. But looking at it, I don’t know why this is supposed to be a good window function because it doesn’t taper gradually to zero, and at non-integer sample positions you would either have to extend it with zeroes abruptly, or accept non-zero values at its edges.

Instead, I have decided to use a simple raised cosine as the windowing function, and to reduce its width slightly to give me some leeway in the kernel’s position between input samples. At the extremities I ensure it is set to zero. It seems to give a purer output than my version of the Lanczos kernel.

Pre-calculating the kernel

Although the calculation is very simple, computing the kernel on-the-fly at every new position would be extremely costly in terms of computing power, so the obvious solution is to use lookup tables. The pre-calculated kernels on either side of the desired sample position are evaluated to give two output values. Linear interpolation can then be used to find the value at the exact position. Because memory is plentiful in PCs, there is no need to skimp on the number of pre-calculated kernels – you could use a thousand of them. For this reason, the errors associated with this linear interpolation can be reduced to a negligible level.

The horizontal position of the raised cosine window follows the position of the centre of the kernel for all the versions that are calculated to lie in between the incoming sample points.

All that remains is to decide how wide the kernel needs to be for adequate accuracy in the reconstruction – and this is where my demo program comes in. I apologise that there now follows a whole load of similar looking graphs, demonstrating the results with various signals and kernel sizes, etc.

1 kHz sine wave

First we can look at the standard test signal: a 1 kHz sine wave. In the following image, the original sine wave points are shown joined with straight lines at the top right, followed by how the points would look when emerging from a DAC that has a sinc-based reconstruction filter (in this case, the two images look very similar).

Next down among the three time domain waveforms comes the resampled waveform, after we have resampled it to shift its frequency by a factor of 0.9 (a much larger ratio than we will use in practice). In this first example, the resampling method being used is ‘nearest neighbour’. As you can see, the results are disastrous!


1 kHz sine wave, frequency shift 0.9, nearest neighbour interpolation

The discrete steps in the output waveform are obvious, and the FFT shows huge spikes of distortion.

Linear interpolation is quite a bit better in terms of the FFT, and the time domain waveform at the bottom right looks much better.


1 kHz sine wave, frequency shift 0.9, linear interpolation

However, the FFT magnitude display reveals that it is clearly not ‘hi-fi’.

Now, compare the results using sinc interpolation:


1 kHz sine wave, frequency shift 0.9, sinc interpolation, kernel width 50

As you can see, the FFT plot is absolutely clean, indicating that this result is close to distortion-free.

Next we can look at something very different: a 20 kHz sine wave.

20 kHz sine wave


20 kHz sine wave, frequency shift 0.9, nearest neighbour interpolation

With nearest neighbour resampling, the results are again disastrous. At the right hand side, though, the middle of the three time domain plots shows something very interesting: even though the discrete points look nothing like a sine wave at this frequency, the reconstruction filter ‘rings’ in between the points, producing a perfect sine wave with absolutely uniform amplitude. This is what any normal DAC produces – and it is something that most people don’t realise: they often assume that digital audio falls apart at the top end, but it doesn’t; it is perfect.

Linear interpolation is better than nearest-neighbour, but pretty much useless for our purposes.


20 kHz sine wave, frequency shift 0.9, linear interpolation

Sinc interpolation is much better!


20 kHz sine wave, frequency shift 0.9, sinc interpolation, kernel size 50

However, there is an unwanted spike at the right hand side (note the main signal is at 18 kHz because it has been shifted down by a factor of 0.9). This spike appears because of the inadequate width of the sinc kernel, which in this case has been set at 50 (with 500 pre-calculated versions of it at different time offsets between sample points).

If we increase the width of the kernel to 200 (actually 201 because the kernel is always symmetrical about a central point with value 1.0), we get this:


20 kHz sine wave, frequency shift 0.9, sinc interpolation, kernel size 200

The spike is almost at acceptable levels. Increasing the width to 250 we get this:


20 kHz sine wave, frequency shift 0.9, sinc interpolation, kernel size 250

And at 300 we get this:


20 kHz sine wave, frequency shift 0.9, sinc interpolation, kernel size 300

Clearly the kernel width does need to be in this region for the highest quality.

For completeness, here is the system working on a more complex waveform comprising the sum of three frequencies – 14, 18 and 19 kHz, all at the same amplitude – with a frequency shift of 1.01.

14 kHz, 18 kHz, 19 kHz sum

Nearest neighbour:


14, 18, 19 kHz sine wave, nearest neighbour interpolation

Linear interpolation:

14, 18, 19 kHz sine wave, linear interpolation

Sinc interpolation with a kernel width of 50:


14, 18, 19 kHz sine wave, sinc interpolation, kernel width 50

Kernel width increased to 250:

14, 18, 19 kHz sine wave, sinc interpolation, kernel width 250

More evidence that the kernel width needs to be in this region.

Ready made solutions

Re-sampling is often done in dedicated hardware like Analog Devices’ AD1896. Some advanced sound cards like the Creative X-Fi can re-sample everything internally to a common sample rate using powerful dedicated processors – this is the solution that makes connecting digital audio sources together almost as simple as analogue.

In theory, stuff like this goes on inside Linux already, in systems like JACK – apparently. But it just feels too fragile: I don’t know how to make sure it is working, and I don’t really have any handle on the quality of it. This is a tricky problem to solve by trial-and-error because a system can run for ages without any sign that clocks are drifting.

In Windows, there is a product called “Virtual Audio Cable” that I know performs re-sampling using methods along these lines.

There are libraries around that supposedly can do resampling, but the quality is unknown – I was looking at one that said “Not the best quality” so I gave up on that one.

I have a feeling that much of the code was developed at a time when processors were much less powerful than they are now and so the algorithms are designed for economy rather than quality.

Software-based sinc resampling in practice

I have grafted the code from my demo program into my active crossover application and set it running with TOSLink from a CD player going into a cheap USB sound card (Maplin), and the output going to a better multichannel sound card (the Xonar U7). The TOSLink data is being resampled in order to keep it aligned with the DAC’s sample rate. I have had it running for 20 hours without incident.

Originally, before developing the test bed program, I set the kernel size at 50, fearing that anything larger would stress the Intel Atom CPU. However, I now realise that a width of at least 250 is necessary, so with trepidation I upped it to this value. The CPU load trace went up a bit in the Ubuntu system monitor, but not much; the cores are still running cool. The power of modern CPUs is ridiculous!! Remember that for each of the two samples arriving at 44.1 kHz, the algorithm is performing 500 floating point multiplications and sums, yet it hardly breaks into a sweat. There are absolutely no clever efficiencies in the programming. Amazing.


Active crossover with Raspberry Pi?

I was a bit bored this afternoon and finally managed to put myself into the frame of mind to try transplanting my active crossover software onto a Raspberry Pi.

It turns out it works, but it’s a bit delicate: although CPU usage seems to be about 30% on average, extra activity on the RPi can cause glitches in the audio. But I have established in principle that the RPi can do it, and that the software can simply be transplanted from a PC to the RPi – quite an improbable result I think!

A future-proof DSP box?

What I’d like to do is: build a box that can implement my DSP ‘formula’, that isn’t connected to the internet, takes in stereo S/PDIF, and gives out six channels of analogue.

Is this the way to get a future-proof DSP box that the Powers-That-Be can’t continually ‘upgrade’ into obsolescence? In other words, I would always be able to connect the latest PCs, streamers or Chromecast to it without relying on the same box having to be the source of the stereo audio itself (which currently means that every time it is booted up it could stop working because of some trivial – or major – change that breaks the system). Witness only this week where Spotify has ‘upgraded’ its system and consigned many dedicated smart speakers’ streaming capability into oblivion. The only way to keep up with such changes is to be an IT-support person, staying current with updates and potentially making changes to code.

To avoid this, surely there will always have to be cheap boxes that connect to the internet and give out S/PDIF or TOSLink, maintained by genuine IT-support people, rather than me having to do it. (Maybe not… It’s possible that if fitment of MQA-capable chips becomes universal in all future consumer audio hardware, they could eventually decide it is viable to enable full data encryption and/or restrict access to unencrypted data to secure, licensed hardware only).

It’s unfortunate, because it automatically means an extra layer of resampling in the system (because the DAC’s clock is not the same as the source’s clock), but I can persuade myself that it’s transparent. If the worst comes to the very worst in future, the box could also have analogue inputs, but I hope it doesn’t come to that.

This afternoon’s exercise was really just to see if it could be done with an even cheaper box than a fanless PC and, amazingly, it can! I don’t know if anyone else out there is like me, but while I understand the guts of something like DSP, it’s the peripheral stuff I am very hazy on. To me, to be able to take a system that runs on an Intel-based PC and make it run on a completely different processor and chipset without major changes is so unlikely that I find the whole thing quite pleasing.

[UPDATE 18/02/18] This may not be as straightforward as I thought. I have bought one of these for its S/PDIF input (TOSLink, actually). This works (being driven by a 30-year-old CD player for testing), but it has focused my mind on the problem of sample clock drift:

My own resampling algorithm?

S/PDIF runs at the sender’s own rate, and my DAC will run at a slightly different rate. It is a very specialised thing to be able to reconcile the two, and I am no longer convinced that Linux/Alsa has a ready-made solution. I am feeling my way towards implementing my own resampling algorithm!

At the moment, I regulate the sample rate of a dummy loopback driver that draws data from any music player app running on the Linux PC. Instead of this, I will need to read data in at the S/PDIF sample rate and store it in the circular buffer I currently use. The same mechanism that regulates the rate of the loopback driver will now control the rate at which data is drawn from this circular buffer for processing, and the values will need to be interpolated in between the stored values using convolution with a windowed sinc kernel. It’s a horrendous amount of calculation that the CPU will have to do for each and every output sample – probably way beyond the capabilities of the Raspberry Pi, I’m afraid. This problem is solved in some sound cards by using dedicated hardware to do resampling, but if I want to make a general purpose solution to the problem, I will need to bite the bullet and try to do it in software. Hopefully my Intel Atom-based PC will be up to the job. It’s a good job that I know that high res doesn’t sound any different to 16/44.1, otherwise I could be setting myself up for needing a supercomputer.

[UPDATE 20/02/18] I couldn’t resist doing some tests and trials with my own resampling code.

Resampling Experiments

First, to get a feel for the problem and how much computing power it will take, I tried running some basic multiplies and adds on a Windows laptop programmed in ‘C’. If using a small filter kernel size of 51 and assuming two sweeps of two pre-calculated kernels per sample (then a trivial interpolation between), it could only just keep up with stereo CD in real time. Disappointing, and a problem if the PC is having to do other stuff. But then I realised that the compiler had all optimisations turned off. Optimising for maximum speed, it was blistering! At least 20x real time.

I tried on a Raspberry Pi. Even it could keep up at 3x real time.

There may be other tricks to try as well, including processor-specific optimisations, programming for ‘SIMD’ (where the CPU performs identical calculations on vectors, i.e. arrays of values, simultaneously), or kicking off threads to work on parts of the calculation so that the operating system can share the tasks optimally across the processor cores. Or maybe that’s what the optimisation is doing anyway.

There is also the possibility that for a larger (higher quality) kernel (say >256 values), an FFT might be a more economical way of doing the convolution.

Either way, it seems very promising.

Lanczos Kernel

I then wrote a basic system for testing the actual resampling in non-real time. This is based on the idea of wanting to, effectively, perform the job of a DAC reconstruction filter in software, and then to be able to pick the reconstructed value at any non-integer sample time. To do this ‘properly’ it is necessary to sweep the samples on either side of the desired sample time with a sinc kernel i.e. convolve it. Here’s where it gets interesting. The kernel’s element values can be calculated as though the kernel were centred on the exact non-integer sample time desired, even though the elements themselves are aligned with, and evaluated at, the integer sample times.

It would be possible to calculate on-the-fly a new, exact kernel for every new sample, but this would be very processor intensive, involving many calculations. Instead, it is possible to pre-calculate a range of kernels that represent a few fractional positions between adjacent samples. In operation, the two kernels on either side of the desired non-integer sample time are swept and accumulated, and then linear interpolation between these two values is used to find the value representing the exact sample time.

You may be horrified at the thought of linear interpolation until you realise that several thousand kernels could be pre-calculated and stored in memory, so that the error of the linear interpolation would be extremely small indeed.

Of course a true sinc function would extend to plus and minus infinity, so for practical filtering it needs to be windowed i.e. shortened and tapered to zero at the edges. Apparently – and I am no mathematician – the best window is a widened duplicate of the sinc function’s central lobe, and this is known as the Lanczos Kernel.

Using this arrangement I have been resampling some floating point sine waves at different pitches and examining the results in the program Audacity. The results when the spectrum is plotted seem to be flawless.

The exact width (and therefore quality) of the kernel and how many filters to create are yet to be determined.

[Another update] I have put the resampling code into the active crossover program running on an Intel Atom fanless PC. It has no trouble performing the resampling in real time – much to my amazement – so I now have a fully functional system that can take in TOSLink (from a CD player at the moment) and generate six analogue output channels for the two KEF-derived three-way speakers. Not as truly ‘perfect’ as the previous system that controls the rate at which data arrives, but not far off.

[Update 01/03/18] Everything has worked out OK, including the re-sampling described in a later post. I actually had it working before I managed to grasp fully in my head how it worked! But the necessary mental adjustments have been made, now.

However, I am finding that the number of platforms that provide S/PDIF or TOSLink outputs ‘out-of-the-box’ without problems is very small.

I would simply have bought a Chromecast Audio as the source, but apparently its Ogg Vorbis encoded lossy bit rate is limited to 256 kbps with Spotify as the source (which is what I might be planning to use for these tests) as opposed to the 320 kbps that it uses with a PC.

So I thought I could just use a cheap USB sound card with a PC, but found that with Linux it did a very stupid thing: turned off the TOSLink output when no data was being written to it – which is, of course, a nightmare for the receiver software to deal with, especially if it is planning to base its resampling ratio on the received sample rate.

I then began messing around with old desktop machines and PCI sound cards. The Asus Xonar DS did the same ridiculous muting thing in Linux. The Creative X-Fi looked as though it was going to work, but then sent out 48 kHz when idling, and switched to the desired 44.1 kHz when sending music. Again, impossible for the receiver to deal with, and I could find no solution.

Only one permutation is working: Creative X-Fi PCI card in a Windows 7 machine with a freeware driver and app because Creative seemingly couldn’t be bothered to support anything after XP. The free app and driver is called ‘PAX’ and looks like an original Creative app – my thanks to Robert McClelland. Using it, it is possible to ensure bit perfect output, and in the Windows Control Panel app it is possible to force the output to 16 bit 44.1 kHz which is exactly what I need.

[Update 03/03/18] The general situation with TOSLink, PCs and consumer grade sound cards is dire, as far as I can tell. I bought one of these ubiquitous devices thinking that Ubuntu/Linux/Alsa would, of course, just work with it and TOSLink.

USB 6 Channel 5.1 External SPDIF Optical Digital Sound Card Audio Adapter for PC

It is reputedly based on the CM6206. At least the TOSLink output stays on all the time with this card, but it doesn’t work properly at 44.1 kHz even though Alsa seems happy at both ends: if you listen to a 1 kHz sine wave played over this thing, it has a cyclic discontinuity somewhere – as if it’s doing nearest neighbour resampling from 48 to 44.1 kHz or something like that…? As a receiver it seems to work fine.

With Windows, it automatically installs drivers, but Control Panel->Manage Audio Devices->Properties indicates that it will only do 48 kHz sample rate. Windows probably does its own resampling so that Spotify happily works with it, and if I run my application expecting a 48 kHz sample rate, it all works – but I don’t want that extra layer of resampling.

As mentioned earlier I also bought one of these from Maplin (now about to go out of business). It, too, is supposedly based on the CM6206:

Under Linux/Alsa I can make it work as a TOSLink receiver, but cannot make its output turn on, except for a brief flash when plugging it in.

In Windows you have to install the driver (and large ‘app’ unfortunately) from the supplied CD. This then gives you the option to select various sample rates, etc. including the desired 44.1 kHz. Running Spotify, everything works except… when you pause, the TOSLink output turns off after a few seconds. Aaaaaghhh!

This really does seem very poor to me. The default should be that TOSLink stays on all the time, at a fixed, selected sample rate. Anything else is just a huge mess. Why are they turning it off? Some pathetic ‘environmental’ gesture? I may have to look into whether S/PDIF from other types of sound card is constantly running all the time, in which case a USB-S/PDIF sound card feeding a super-simple hardware-based S/PDIF-to-TOSLink converter would be a reliable solution – or simply use S/PDIF throughout, but I quite like the idea of the electrical isolation from TOSLink.

It’s not that I need this in order to listen to music, you understand – the original ‘bit perfect’ solution still works for now, and maybe always will – but I am just trying to make SPDIF/TOSLink work in principle so that I have a more general purpose, future-proof, system.

My mid-80s video framestore

I was looking through some old photos the other day and was reminded of a thing I built back in the mid-80s. I had become obsessed by the idea of building a device to capture and display photographic images at a time when no normal computer could do it. Your standard home computer like a BBC Micro, for example, could only display a small number of colours and couldn’t even display a smoothly-graduated monochrome image. Later, more sophisticated computers like the Commodore Amiga couldn’t do it without strange restrictions on which colours could be adjacent to others.

I was fully aware of the basic idea of digitising waveforms and storing the results in RAM, having played around with audio sampling prior to this. I found that it was possible to buy chips that could generate frame and line sync pulses from a composite video stream i.e. the output of a standard video recorder or camcorder, and also to split composite video into R, G and B analogue components suitable for sampling. A standard computer monitor could take in separate sync & RGB signals so it wasn’t then necessary to do the reverse and generate composite video again.

Putting it all together, I could build a device that would enable me to grab a single video frame and store it in RAM. I could then replay the frame over and over, reconstituting it via three DACs, to be fed to a standard RGB monitor. I could also stop this process, and allow a computer (a BBC Micro) to read the contents of the RAM for storage on disk. The computer could also upload stored images into the RAM for display – and this would also allow for the possibility of ‘Photoshopping’ images or synthesising them in software.

The pièce de résistance was that this arrangement naturally produced a live digitised image on the monitor that could be frozen by pressing a button.

As I recall, the main technical hurdles were:

  1. High speed ADCs and DACs were expensive and/or outside my comfort zone. In the end, I used three 6-bit ‘flash’ ADCs, and my own home-made R-2R DACs. Consequently, I could capture and display 262,144 colours, which doesn’t sound like much compared to today’s standard 16 million but was adequate. In monochrome I could display a 64-level grey scale image, which was sufficient to be called ‘photographic’.
  2. How to lock my pixel clock to the incoming video stream. As a stopgap while I thought of something better, I made a super-simple analogue oscillator out of CMOS Schmitt triggers that could be started (as opposed to its output being gated) by setting an input logic level.
  3. RAM was pretty expensive – except for dynamic RAM, and I thought this was too complicated to contemplate. In the end I used a bunch of static RAM chips to give me a resolution of 256×256 pixels. Again it doesn’t sound like much, but with the relatively fine colour graduations, it was not too bad at the time.
  4. A standard UK PAL video frame has 625 lines (although only 576 lines are visible) comprising two interlaced fields of half that number of lines. If I was aiming for a resolution of 256 pixels, I clearly could not digitise the whole frame. In the end I think I sampled and displayed just one of the fields, cropping the middle 256 lines out of the 288 visible lines by starting to digitise once a certain line count was reached after the top of the field. When displaying a sampled image, the same image was in effect displayed in each field.
  5. I needed to make double-sided PCBs for at least part of this device in order to simplify its construction. This involved arduous work with acetate sheets, self-adhesive tape and transfer symbols, and a scalpel.

The uppermost of a stack of three identical PCBs incorporating memory, computer I/O, ADC and DAC for the red, green or blue component.

I eventually made it work pretty well. I started with a single channel of monochrome, and I remember that the first time I ‘froze’ a perfect monochrome image was one of those moments that I probably live for.

I didn’t progress beyond the simple analogue pixel clock – which effectively meant that I set the image’s horizontal width with a potentiometer. It seemed to work perfectly well.

Of course, as so often happens, once the initial thrill was over I didn’t use it much after that, eventually putting the thing to the back of a cupboard and never touching it again – it is still there!

Here are a few of the images I grabbed, mainly from broadcast TV or pointing a camera at magazines. As you can see, it actually worked.


(I seem to remember transferring the images from the BBC Micro via serial cable to a PC in 1998 – the datestamp on the images – and, knowing me, probably substituted the raw image data into a 256×256 image created in Photoshop or similar so I never had to actually understand the image header. I would then have resized the images from their non-square pixels to what looked right on the PC monitor using Photoshop. There would have been no more image manipulation after that, so these are effectively the raw images).

The Trouble with Hobbies

Have you ever suddenly been inspired to embark on a brand new hobby?

Maybe you’ve never owned a boat before, but having seen one chug by on the river you have thought “I’d love to do that!”. A quick browse in the classified ads shows lots of boats that look fine, and they don’t cost all that much. Basically any boat would be great, and you could gradually do it up, even if it is a bit shabby now. In your mind’s eye, your family will love you when you are able to take them on spur-of-the-moment, cheap weekends messing about on the water, starting in a few weeks’ time.

From this high point where the world is your oyster, you begin to take the advice of the magazines and other experienced hobbyists. Before you have even owned a boat, you become aware of the hierarchy of boat owners, and the boats that would render you a laughing stock if you owned them. You become aware of the general consensus on different types of bilge pump – not something you ever wanted to know. You begin to form an idea of the boat you should really go for – and it is not one of the bargain basement jobs you first saw. You might just about be able to stretch to a boat that would put you in the lower echelons of boat ownership but, importantly, not on the very lowest rung. You could always, perhaps, move up from there over time.

It now turns into an all-consuming hobby with the goal of having a boat on the river at the end of the year. In the end it costs thousands, and your children have grown up and left home before your boat finally takes to the water. You hit a bridge and rip the top off your boat the first time you take it out. You feel sick and abandon the whole hobby (a true story).

That’s the nature of male hobbies. They start out as wonderful, spontaneous ideas, but can turn into nightmares – mainly due to the existence of other hobbyists! Audio is one of those hobbies, I think. Ridiculously, the prices paid for bits of audio knickknackery rival the costs of boats.

A person could be seized one day by the idea of hi-fi as a way to improve their life, buy an amp and some secondhand speakers off Gumtree for £100, and plug their tablet or laptop headphone socket into the amp using a £2 cable. Hey presto, a hi-fi system that will sound much better than what they had before, and which has tinker-ability via the buying and selling of speakers and the audio streaming/library software options; there is no urgency in changing the amp and tablet hardware as they are pretty much perfect in what they do. The speakers are almost like pieces of furniture, so the person can indulge their tastes in how they look as well as how they sound, and they can be restored using standard DIY skills – a nice mini-hobby.

But what if the person does the natural male thing, and starts to read the magazines and forums? Immediately they will realise that their tablet’s headphone output is a joke in the audio world. They need to spend at least a few hundred pounds on a half-decent ‘DAC’, plus a couple of hundred on a budget cable. And of course, this is only for convenience: real audio quality can only be had if they own a decent turntable and a special vibration-free shelf to put it on. Where do they go from there? They need to make a decision on which turntable and which cartridge to go for. They need to take a view on cables, power conditioners, valve or solid state amps, accessories like cable lifters and record cleaning machines. Each decision, they are assured by their fellow hobbyists, will result in “night and day” differences in the sound.

After some months agonising over it, they assemble a beginner’s system for about £3,000 – they will upgrade as budget allows. It sounds OK, but they know that even though the brand is a highly recommended one, the particular model of valve amplifier they could afford has “hints of a slightly reticent mid range” – one of the magazines said so – and if they listen carefully, perhaps they can hear that… But the more powerful 18 Watt model cost £800 more and they decided against it. Perhaps they made the wrong decision. The nightmare unfolds…

Pop and click remover, old electronics magazines

Just saw a short article about a new product that aims to remove the pops and clicks from vinyl records. It…

…digitizes the signal at 192/24 bit resolution and then uses a “non-destructive” real time program that removes pops and clicks without, the company claims, damaging the music.

…In addition to real-time, non-destructive click & pop Removal the SC-1 features user controllable click & pop removal “strength”, a pushbutton audiophile-grade “bypass” that lets you hear non-digitized versus digitized signal (for when you don’t need pop and click removal), iOS and Android mobile app control and 192/24 bit hi-res digital processing.

Of course it is highly ironic that a vinyl enthusiast should need the services of the digital world to improve the sound of his recordings. And it is obvious (surely) that the digital stream could be stored for later replay without needing to further degrade the original vinyl or wear out the multi-thousand dollar stylus that is no doubt being used. (Omitting to mention the most obvious idea of just listening to a digital recording…)

The aim of the product reminded me of a certain project in an old electronics magazine, a huge number of which I still have in a set of bookshelves that I haven’t touched since 1990 – the date of the last magazine I seem to have bought. Sifting through them, it is amazing how familiar the front covers still are – a measure of the intensity of youthful hobbies.


From Electronics Today International in April 1979, the project I remembered was a ‘Click Eliminator’ for vinyl records based on an analogue CCD delay line. The idea was to insert a few milliseconds of silence in place of the offensive click. Here’s how it worked:
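The same idea is easy to sketch in software: watch for a sudden sample-to-sample jump and substitute a few milliseconds of silence. This is my own illustrative reconstruction of the principle, not the ETI circuit (which was analogue, built around the CCD delay line), and the threshold and mute length are made-up figures:

```python
def click_eliminator(samples, threshold=0.5, mute_len=128):
    """Crude click remover in the spirit of the 1979 ETI design:
    when the sample-to-sample jump exceeds `threshold`, output
    silence for the next `mute_len` samples (a few ms at 44.1 kHz)."""
    out, mute, prev = [], 0, 0.0
    for s in samples:
        if abs(s - prev) > threshold:   # a step too steep to be music
            mute = mute_len
        out.append(0.0 if mute > 0 else s)
        if mute > 0:
            mute -= 1
        prev = s
    return out
```

A real implementation would need the delay line so that muting can begin just before the detected click, and something smarter than a fixed threshold, but the skeleton is the same.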


Electronics Today International was the magazine I would go to WH Smiths for on a Saturday, being terribly disappointed if the latest issue wasn’t in. I would say more than 50% of issues featured an audio or hi-fi project: from 1982 an active speaker project for example, or from 1986 “Can Valves make a comeback?” with an accompanying valve amp project. There were any number of MOSFET amps, phono pre-amps, tape noise reduction units. Electronic music featured prominently with projects for effects pedals and synthesisers galore. I devoured this stuff.

Other magazines included Practical Electronics, Wireless World, Everyday Electronics, Elektor, Electronics and Music Maker, and one I didn’t recall: Hobby Electronics. I also bought any number of computer magazines. I have never thrown any away, so I have hundreds of them gathering dust.

Thoughts on creating stuff


The mysterious driver at the bottom is the original tweeter left in place to avoid having to plug the hole

I just spent an enjoyable evening tuning my converted KEF Concord III speakers. Faced with three drivers in a box, I was able to do the following:

  • Make impulse response measurements of the drivers – near and far field as appropriate to the size and frequency ranges of the drivers (although it’s not a great room for making the far field measurements in)
  • Apply linear phase crossovers at 500Hz/3100Hz with a 4th order slope. Much scope for changing these later.
  • Correct the drivers’ phase based on the measurements.
  • Apply baffle step compensation using a formula based on baffle width.
  • Trim the gain of each driver.
  • Adjust delays by ear to get the ‘fullest’ pink noise sound over several positions around the listening position.
  • ‘Overwrite’ the woofer’s natural response to obtain a new corner frequency at 40 Hz with 12dB per octave roll off.
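For the baffle step compensation in the list above, the usual rule of thumb puts the centre of the roughly 6 dB step at f3 ≈ 115 / baffle width (in metres). A minimal sketch – the constant 115 is the textbook approximation, and the 0.28 m width is just an illustrative figure, not a measurement of these cabinets:

```python
def baffle_step_f3(baffle_width_m):
    """Approximate centre frequency (Hz) of the ~6 dB baffle step
    for a baffle of the given width in metres (f3 ~= 115 / width)."""
    return 115.0 / baffle_width_m

f3 = baffle_step_f3(0.28)   # a 28 cm wide baffle steps up at roughly 410 Hz
```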

The KEFs are now sounding beautiful although I didn’t do any room measurements as such – maybe later. Instead, I have been using more of a ‘feedforward’ technique i.e. trust the polypropylene drivers to behave over the narrow frequency ranges we’re using, and don’t mess about with them too much.

The benefits of good imaging

There is lovely deep bass, and the imaging is spectacular – even better than my bigger system. There really is no way to tell that a voice from the middle of the ‘soundstage’ is coming from anywhere but straight ahead and not from the two speakers at the sides. As a result, not only are the individual acoustic sources well separated, but the acoustic surroundings are also reproduced better. These aspects, I think, may be responsible for more than just the enjoyment of hearing voices and instruments coming from different places: I think that imaging, when done well, may trump other aspects of the system. Poorly implemented stereo is probably more confusing to the ear/brain than mono, leaving the listener in no doubt that they are listening to an artificial system. With good stereo, it becomes possible to simply listen to music without thinking about anything else.

Build a four way?

In conjunction with the standard expectation bias warning, I would say the overall sound of the KEFs (so far) is subtly different from my big system, and I suspect the baffle widths have something to do with this – as well as the obvious fact that the 8 inch woofers have about half the area of the 12 inch drivers, and the enclosures are one third the volume.

A truly terrible thought is taking shape, however: what would it sound like if I combined these speakers with the 12 inch woofers and enclosures from my large system, to make a huge four way system..? No, I must put the thought out of my head…

The passive alternative

How could all this be done with passive crossovers? How many iterations of the settings did it take me to get to here? Fifty maybe? Surely it would be impossible to do anything like this with soldering irons and bits of wire and passive components. I suppose some people would say that with a comprehensive set of measurements, it would be possible to push a button on a computer and get it to calculate the optimum configuration of resistors, capacitors and inductors to match the target response. Possibly, but (a) it can never work as well as an active system (literally, it can’t – no point in pretending that the two systems are equivalent), and (b) you have to know what your target response is in the first place. It must surely always be a bit of an art, with multiple iterations needed to home in on a really good ‘envelope’ of settings – I am not saying that there is some unique golden combination that is best in every way.

In developing a passive system, every iteration would take between minutes and hours to complete, and I don’t think you would get anywhere near the accuracy of matching of responses between adjacent drivers and so on. I wouldn’t even attempt such a thing without first building a computerised box of relays and passive components that could automatically implement the crossover from a SPICE model or whatever output my software produced – it would be quite a big box, I think. (A new product idea?)

Something real

With these KEFs, I feel that I have achieved something real which, I think, contrasts strongly with the preoccupations of many technically-oriented audio enthusiasts. In forums I see threads lasting tens or even hundreds of pages concerning the efficacy of USB “re-clockers” or similar. Theory says they don’t do anything; measurements show they don’t do anything (or even make things worse with added ground noise); enthusiasts claim they make a night and day improvement to the sound -> let’s have a listening test; it shows there is no improvement; there must have been something wrong with the test -> let’s do it again.

Or investigations of which lossless file format sounds best. Or which type of ethernet cable is the most musical.

Then there’s MQA and the idea that we must use higher sample rates and ‘de-blurring’ because timing is critical. Then the result is played through passive speakers with massive timing errors between the drivers.

All of these people have far more expertise than me in everything to do with audio, yet they spend their precious time on stuff that produces, literally, nothing.

New bass drivers for KEF Concords

Finally got round to ordering some better bass drivers for the KEF Concord III conversion at the very high end price of £19 each.

They’re Skytronic 902.208 8″ polypropylene drivers, and as you can see, they’re quite a bit beefier magnet-wise than the Peerless SKO200.

There seems to be some confusion about the Thiele Small parameters for this driver. As far as I can tell, the ones here are correct. It probably works out that the 30l KEF cabinets are too small, and we end up with a Q of 0.97. No matter.
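Whatever the true parameters, the Q figure follows from the standard sealed-box relations: with compliance ratio α = Vas/Vb, both the free-air resonance and the total Q scale by √(1 + α) once the driver is in the box. A sketch with purely illustrative numbers, not the (disputed) 902.208 datasheet values:

```python
import math

def closed_box(fs_hz, qts, vas_litres, vb_litres):
    """Sealed-box alignment: resonance and total Q both scale by
    sqrt(1 + Vas/Vb) when the driver is mounted in the enclosure."""
    k = math.sqrt(1.0 + vas_litres / vb_litres)
    return fs_hz * k, qts * k   # (fc, Qtc)

# Illustrative driver parameters in a 30 litre box:
fc, qtc = closed_box(fs_hz=35.0, qts=0.55, vas_litres=60.0, vb_litres=30.0)
```

With these made-up figures the 30 litre box gives Qtc ≈ 0.95 – the same ballpark as the 0.97 mentioned above.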

I have measured the driver in the cabinet in the nearfield, and attempted to correct it for phase and amplitude, and then modified the filter to give me a driver with 38 Hz corner frequency and a roll-off at 12dB per octave. The cones move quite a lot sometimes, but the sound is good.


902.208 mounted in place on the KEF III. The diameter of this driver rim is smaller than both the originals and the previous Peerless replacements, hence the need to clamp the driver as there isn’t sufficient wood to screw into.

Software: the future of audio

Last night, on a whim, I decided that I would like my active crossover software to display some sort of indication of the output levels being sent to the DACs. This is quite important, and something that I should have tackled quite a while ago. Basically, we should be worried about clipping, and also ‘overs’ i.e. those interpolated samples that are generated by DAC reconstruction filters in between the recorded samples, and which can clip even when the recorded samples themselves do not. By messing around with various types of driver correction and so on, am I running the risk of clipping? Or am I wasting DAC resolution by needlessly attenuating my DAC outputs too much?

Here is how easy it was to display the information in a useful and aesthetically pleasing way:

  • I created six vertical rectangular areas on the active crossover app’s screen – one bargraph for each DAC output.
  • I decided upon a linear percentage display (not dB) and an update rate of 10 Hz.
  • A timer was set to trigger at 10 Hz (the timer is provided by the GTK GUI library) and call the function to draw the six bargraphs.
  • In the output function for the DACs, I take the absolute value of each sample as I write it to the DAC and compare it to the maximum recorded so far for that channel (out of six channels). I overwrite the maximum if it is exceeded. There is a ‘mutex’ interlock around the maximum value to prevent the bargraph drawing function from accessing it at the same moment.
  • The bargraph drawing function for each channel accesses that maximum recorded value and saves it. The maximum value for that channel is then reset to zero. The saved value is compared against that bargraph’s previous displayed value. If it is greater, a coloured rectangle is drawn directly proportional in length to the value. If it is less, the previous value is multiplied by 0.9, and the rectangle drawn to that height, instead. With this simple system, we have a PPM-style display that shows signal peaks that slowly decay.
  • The bargraph display function also records an absolute maximum for that channel, which doesn’t get reset. This value is displayed as a red horizontal line, thus showing the maximum output level for that particular listening session.
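The peak-hold and decay behaviour in the last few bullets reduces to a few lines of logic per channel. This sketch uses hypothetical names, and omits the mutex that the real code needs between the audio thread (on_sample) and the GUI timer (on_timer_tick):

```python
class BarGraph:
    """One channel of the PPM-style display described above."""
    DECAY = 0.9   # applied on each 10 Hz tick when the signal is falling

    def __init__(self):
        self.channel_peak = 0.0   # max |sample| since the last tick
        self.displayed = 0.0      # current bar height, 0.0 to 1.0
        self.session_max = 0.0    # the red line: never reset

    def on_sample(self, s):
        """Called from the DAC output path for every sample written."""
        if abs(s) > self.channel_peak:
            self.channel_peak = abs(s)

    def on_timer_tick(self):
        """Called at 10 Hz by the GUI timer; returns the bar height to draw."""
        peak, self.channel_peak = self.channel_peak, 0.0
        self.session_max = max(self.session_max, peak)
        if peak > self.displayed:
            self.displayed = peak          # jump up instantly
        else:
            self.displayed *= self.DECAY   # decay slowly towards zero
        return self.displayed
```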

The result is one of those attractive arrays of VU meters that dances in response to the incoming signal levels. The results were interesting, and will alert me to any future mis-steps with regard to clipping – it still doesn’t tackle the issue of ‘overs’ directly, however.
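One way to tackle ‘overs’ directly – this is my own sketch, not something the crossover program does – is to oversample the output in software with a windowed-sinc interpolator, approximating what the DAC’s reconstruction filter will do, and look for reconstructed values above full scale:

```python
import math

def intersample_peak(samples, factor=4, half_taps=32):
    """Estimate the reconstructed (inter-sample) peak of a signal by
    windowed-sinc interpolation at `factor` times oversampling.
    Only the fully-supported interior region is scanned."""
    peak = 0.0
    for m in range(half_taps * factor, (len(samples) - half_taps) * factor):
        t = m / factor                       # position in input-sample units
        acc = 0.0
        for n in range(int(t) - half_taps + 1, int(t) + half_taps):
            x = t - n
            sinc = math.sin(math.pi * x) / (math.pi * x) if x else 1.0
            hann = 0.5 + 0.5 * math.cos(math.pi * x / half_taps)
            acc += samples[n] * sinc * hann
        peak = max(peak, abs(acc))
    return peak

# Samples of a full-scale fs/4 sine with 45 degree phase all sit at +/-1.0,
# but the reconstructed waveform peaks near sqrt(2) - a classic 'over'.
sig = [math.sqrt(2) * math.sin(math.pi * n / 2 + math.pi / 4) for n in range(200)]
```

This is essentially how ‘true peak’ meters work; 4× oversampling is the typical choice.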

But the reason for mentioning it is to show the power and simplicity of engineering with software. To build a PPM meter in hardware and wire it all up would not be trivial, and would take days, weeks or months for a commercial product. In software, it takes less than an hour and a half to construct it from scratch. Audio processing functions are equally simple to create and integrate within the system. It seems clear that once the basic DSP ‘engine’ is in place, complex audio systems can be put together like Lego. A perfectly capable three-way speaker can be built in days. It is not too hard to see how a three-way, six channel DSP system could simply be scaled up to create something like the Beolab 90.

Is this an exciting trend, or the end of everything that makes audio interesting? I think it is the former, but I can see that many traditionalists might disagree.

KEF Concord III conversion

kef badge

Recently, I thought I might try to combine modern technology with the styling of 70s hi-fi by converting a pair of KEF Concord IIIs to work with DSP active crossovers, and also upgrade them from 2.5-way to 3-way with all-new drivers. The scheme is based on the same software and DAC that I used for my earlier DIY effort.


Some KEF Concord IIIs (not my particular pair)

I bought the KEFs a few years ago because I thought they looked fabulous. I thought they would sound OK because they’re not tiny and contain two 8 inch drivers. I was wrong: to me they sounded weak and ‘boxy’, so it required no soul-searching for me to decide to modify them irreversibly. Who knows: maybe they had bad capacitors or something, but as you might have guessed, I probably wasn’t going to be keeping them in their original form, anyway.

I didn’t give my conversion project much planning. I already had some Peerless 8″ polypropylene drivers bought very cheap, which WinISD indicated were perfect for the enclosures, and I thought I could cross these over to 3″ drivers rather than the 4″ I used for my big speakers; I duly bought some Monacor SPH75/8 polypropylene mid-bass drivers. I thought about using 19mm tweeters, but in the end plumped for the same Monacor DT25 as I used in my main system because of their small size, particularly the front flange. All pretty cheap.

The KEFs are stylishly covered in a fabric ‘sock’ that was no doubt very cheap to make, but I think looks good. (There is even the possibility of commissioning the very talented mother-in-law to make new ones in funky colours).

I removed the small plinth at the base of the speaker (four long wood screws) and peeled back the ‘sock’ from there to reveal a rounded chipboard enclosure and the three drivers – the Concord is a 2.5-way system. I decided that I would replace one of the 8″ drivers with my mid and tweeter, and that I should therefore invert the enclosure in order to keep all three drivers close together with the tweeter close to the top of the enclosure. I removed the two 8″ drivers but left the original tweeter in position as a ‘plug’ for its hole.

I dusted off the router and made two 18mm MDF flanged discs to replace the 8″ drivers. I should have made the flanges wider because they’re not quite wide enough to take a screw head and clear the necessary foam gasket underneath, meaning I’ll have to clamp them externally. I painted them to seal in the sawdust.

The SPH75/8 is troublingly difficult to mount for a one-off hand-made ‘rapid prototype’: a virtually non-existent flange from the front or behind, and a magnet that is almost as wide as the driver, meaning that if you mount it from the front, there’s almost no gap for the air to flow around unless you widen out the area around the driver from behind. It’s squarish, so if you mount it from behind but don’t want the full thickness of the baffle in the way, you end up having to accommodate the corners, which is fiddly without machining a complex-shaped recess. I ended up mounting the driver from behind, shaping the corners with a chisel. Next time, I will definitely find a woodworking expert to make the ‘plugs’ to my CAD designs!

I needed to make a chamber for the SPH75/8. WinISD told me it should ideally be 3 litres or so – but probably not all that critical for the mid range. I figured the easiest way to do it would be some 110mm plastic piping from the local DIY shop which is quite thick and fairly ‘dead’ if you knock it. I could even buy a ready-made fitting to allow me to plug the end. I duly made an assembly and fastened it to the rear of the MDF ‘plug’ using some bent aluminium brackets. I stuffed it with speaker wadding. The volume works out at about 2 litres, so not far off ideal.

IMG_0488 cropped

Mid range chamber made from 110 mm plastic pipe and end cap. Hopefully airtight by virtue of neoprene foam gaskets. It is stuffed with wadding.

Using self-adhesive neoprene foam and P-section draught excluder (this really does make a great seal), and plugging various holes, I rendered the mid range and bass enclosures pretty airtight. A top tip: hot melt glue is your friend. It plugs holes and gaps perfectly, and I have found that with a quick application it doesn’t seem to melt PVC cable insulation or ABS, so it’s ideal when you just want to feed cables through a hole in wood or plastic and seal the hole.

Crudely fastening it all together, I fired up one speaker to have a quick listen using slightly modified settings from my big system. I found it really interesting and encouraging, but when the bass drivers were played in isolation there was audible distortion. I worried that it might be the enclosures (they are made from mere 15mm chipboard), but I eventually narrowed it down to the drivers.


A modified KEF Concord. Those particular pieces of foam are just a temporary experiment, and would be too thick to fit under the fabric cover, anyway.

The mid and tweeter sounded spot on.

After building up the second speaker, the next stage was to set them up slightly more scientifically than before. I measured them (woofer near field, and mid and tweeter far field ‘pseudo-anechoically’) and applied roughly the appropriate correction to each driver (phase and frequency response, delay, gain). I also implemented bass extending EQ to aim for the same response as my big system(!) i.e. a 38Hz -3dB point. It sounded pretty reasonable, but I knew the bass drivers were not very good.

Next, I replaced the bass drivers with some cheap but much better Skytronic units (the same brand as my larger speakers). I made the appropriate measurements and compensations in the DSP, and raised the -3dB point to 40 Hz for the sake of reducing the power into the bass drivers.

I added some bracing to the most obviously flappy bits of the KEF Concord enclosures. Broom handle was much cheaper than dowel of the same diameter! The black square between dowel and enclosure is 1mm neoprene sheet. Dowel held in with countersunk wood screws from outside the enclosure.

Yes, the photos make it all look very ‘agricultural’, and the wide angle iPhone lens makes this bit of it look anything but square and perpendicular, but it is actually about right, and the speakers are solid, airtight, etc. where it matters.


Did the bracing change the sound? Can’t say, but it had to be done. I measured the driver in the near field again, and it hadn’t changed at all.

I re-fitted the fabric ‘socks’ – which I managed to get wrinkle-free much to my surprise.

finished KEF

A KEF Concord III with its fabric covering restored

As mentioned before, I ended up inverting the enclosures which meant that I had to remove the fabric ‘socks’ which were stapled very close to the ‘lip’ that is formed at the top of the enclosure. I was worried that I couldn’t find a staple gun that could get right into the corner of this lip, but in the end I found that an ordinary office stapler could do the job, which was fine. At the bottom of the enclosure, there are drawstrings which are pulled tight and tied off. The fabric stretches, so it forms a very flat covering.

The coverings are in pretty good condition for speakers over 37 years old, with just a couple of snags and small holes. They have faded from black to a very dark blue over the years which is only obvious if any of the non-faded material becomes visible through any slight misalignment. New coverings could be made in a variety of colours, but I think it would be preferable to retain the moderately coarse texture of the original material if possible.

Something that seems to have been an irritation to the previous owner is that the tops of the speakers are capped off with a square of hardboard covered with fabric, and over time these have warped, with the corners rising slightly. The previous owner had re-applied them using No-More-Nails or similar, to no avail.

In the end, I restored the fabric caps by carefully removing the material from the original hardboard, and stretching them over sheets of black-sprayed 2mm aluminium, fastening them to it with carpet tape. These now fasten to the tops of the speakers using Velcro. They look really good. I gave the pieces of cloth a rinse in warm water because they were looking a bit grotty, having taken the brunt of spilled drinks at parties over the years I imagine. It looks to me that it would be possible to give the whole fabric sock a proper wash, and it would survive OK.

I resprayed the wooden plinths with black satin paint, and the same for the 1970s speaker stands with casters.

I also decided to try the option of some of the original ‘inverted mushroom’ stands. I bought some Mk IV ‘donor’ speakers, but unfortunately had to do a bit of metalwork to make them work with the Mk III. They now look ‘the business’, and I am very much enjoying their sound.

October 2017: Step Response Measurement

I decided to measure the speakers for time alignment. In order to do this, I measured with a microphone at tweeter height and 1m away from the speaker – just to make it the standard measurement position. I used REW to make the measurement using a sweep from 10Hz to 20 kHz and duration about 24s.
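For what it’s worth, the step response is derived from the measured impulse response: a step is the integral of an impulse, so for sampled data it is simply the running sum. A trivial sketch, usable on any impulse response exported from REW:

```python
def step_response(impulse):
    """Cumulative sum of the impulse response: the response to a unit
    step is the running integral of the response to an impulse."""
    out, acc = [], 0.0
    for h in impulse:
        acc += h
        out.append(acc)
    return out
```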


This is the result I got:

concord step response

I am assuming that this shows that it is pretty reasonable. In Stereophile’s 1997 article on measuring speakers they show a similar image:

Fig.11 shows a good step response produced by a time-coherent, three-way loudspeaker, with the outputs of the three drive-units adding in-phase at the microphone position. There are not that many speakers that produce this good a step response. Of the speakers I have measured for Stereophile, only about 10—models from Quad, Thiel, Dunlavy, Spica, and Vandersteen—have step responses this good.

Fig.12 shows a more typical step response, again of a three-way loudspeaker. This time there are actually three step responses apparent in the graph: a narrow, positive-going step response from the tweeter; the next, negative-going step is the midrange unit (as will be seen, it’s connected with opposite polarity to the tweeter); with finally a slow, wide positive pulse from the woofer.

If you do a Google Image Search for ‘stereophile step response’ the results are quite interesting: true step responses are still quite rare.

Scalford Hall 2017


I took my KEF Concords to the HiFiWigwam show in March. The room may have been smaller than last year’s, and I didn’t think the speakers sounded as good as they might. Nevertheless I found a few nice comments on the web.

Generally this year all the best sounding systems were active, with a system by “looper” which cost less than £300 with cheap drivers and FIR filters playing through a AV amp – leaving most high end speakers systems at the show in its wake.

Big shout out to Looper in room 232 who proved you don’t have to spend thousands to enjoy decent hifi. 

Looper in 232 had a similar home made set of filters driving the rebuilt Concords.  Better sounding than they were in 1975 I’m sure.  It was great talking to him about his thoughts and design ideas, and he managed it on a very tight budget.

All rooms were great but the things that got me going back for more –

…- Looper’s lovely KEF concord III

I was able to do the same ‘party trick’ as my other DIY system, where I showed that changing the crossover frequencies and slopes in real time – even quite drastically – had virtually no audible effect on the sound. If, however, we listened to, say, the mid range driver in isolation, the change was plainly obvious. I would say that this was what you would expect from a correctly set up system with not-too-bad dispersion characteristics, but most people had never encountered it before. The ‘trick’ depends on all the filters being calculated and implemented on-the-fly, and the fixed driver correction and in-room EQ being implemented as separate layers that are overlaid on the crossover filters. I think this demonstration is a kind of sanity check that your setup is somewhere near where it needs to be.
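The layering works because cascaded linear filters combine by convolution: the on-the-fly crossover filter and the fixed correction/EQ filters can be merged into a single kernel per driver, so changing the crossover never disturbs the correction. A minimal sketch of the combining step (illustrative, not the actual program’s code):

```python
def convolve(a, b):
    """Discrete convolution: cascading two FIR filters is equivalent
    to a single FIR whose taps are the convolution of the two."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# e.g. merge a (toy) crossover kernel with a (toy) correction kernel:
combined = convolve([0.25, 0.5, 0.25], [1.0, -0.5])
```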

I can’t say why this is considered so unusual, but the fact that it is could go part of the way in explaining why most speakers sound a bit ‘odd’, and why speaker design is considered to be a mysterious art rather than a very straightforward procedure. It seems probable that the results would not be as transparent with a two-way speaker as a three-way because this introduces several new factors into the equation: significant driver beaming – a phenomenon that cannot really be corrected or neutralised – and issues with the drivers having to cover wider frequency ranges. Also, non-linear phase crossovers introduce “phase rotations” through the crossover.

[Last edited 30/06/17]



What should we be listening for?

As you may have seen, I built my own audio system because I had never before heard a system which took the seemingly obvious steps of using large sealed woofers, time alignment, DSP crossovers, driver correction etc., and I wanted to know what it sounded like. I also wanted to experiment with various aspects of crossover design (although I ended up doing less of this than I expected), and to understand what is important versus what is myth. Maybe DEQX, Kii, B&O or Meridian could have sold me something that sounded good, but it would have been an expensive black box that I couldn’t tinker with.

What have I learned from listening to my DIY system? I think the following:

There is a superficial ‘hi-fi’ sound that is achieved by many conventional systems – and I have owned some of these systems. The frequency response is balanced. Harmonic distortion is low – it sounds ‘clean’ at moderate volumes – it has bass and top end in seemingly generous amounts. It is certainly ‘stereo’ as you can clearly hear different things coming from the left, right and middle. If you start to turn up the volume, it does begin to sound somewhat ‘loud’ and ragged, and it can also feel as though the sound is being extruded through an opening that’s slightly too small. After a long listening session at high volumes your ears feel quite ‘sore’, but you are reasonably satisfied with the sound. This, presumably, is how recordings must really sound – you certainly can’t put your finger on anything that isn’t a reasonable facsimile of what it is supposed to be.

And then there are other, more specialised systems which cost much more to buy, and take us into the realms of audiophilia. They’re often impressive to look at, but I think they can sometimes sacrifice “high fidelity” in order to indulge their creator’s interest in a particular material or ‘retro’ technology. You may disagree.

There should be the possibility of a system that just implements the obvious pragmatic steps necessary to get the recorded waveform out of the speakers reasonably accurately (however we define that). There aren’t many of these about, it seems to me. I have heard only one such system – the one I put together using cheap off-the-shelf parts and DSP ‘glue’ – and I find its sound to be different and, if I may say so, better than conventional systems. Here’s what I think it sounds like:

The first thing that strikes you is that although it is ‘clean’, ‘sweeter’ and less ‘edgy’ than the conventional system, it also has ‘flavour’ and ‘body’ – it sounds just like real music. If it’s a double bass being plucked, you hear the fingers releasing the string, and the sound hits you in the chest; if it’s a hi-hat cymbal being struck, you hear the stick meeting the cymbal in delicate detail; if the sound is being made by something heavy and solid, or wooden and hollow, you picture the object making the sound. A part of the dynamism of the sound is how quickly it stops, as well as how quickly it starts. Bass is ‘real’ and not just a ‘rumble generator’; there is no arbitrary limit on how low the bass can go – not that you analyse the sound in those terms. It is simply ‘real’.

Next, you notice the imaging, the clear separation of the instruments, and the acoustics. The person singing is in front of you, located at some position in space between or behind the speakers – you feel as if you could reach forward and touch them. If it’s a live acoustic recording, they are in an acoustic space, and you are there too. There is separation between the singer and other instruments spread around the space – if that is how they were recorded. In a studio recording, maybe the vocalist is in a smaller space than your listening room, and you picture them close to the mic in a booth, perhaps, singing towards you. Or if the sound is coming from within a cool, stone cathedral, you picture the cathedral extending beyond the walls of your room. This is definitely the ‘party trick’ of stereo – a compelling, coherent acoustic space that appears as if by magic in thin air.

Then, you realise that you can listen to the recording at its intended volume. The ‘natural’ volume setting for each recording is usually quite apparent, but the system doesn’t mind what volume you set it at. The copious, dynamic bass means that you are not tempted to turn up the level excessively to compensate for a missing part of the spectrum. Your ears don’t ring afterwards, and even after listening for long periods at what would normally be considered high volume, they don’t feel sore. And there is a physical element to loud, natural, dynamic music that generates an excitement all its own.

It doesn’t need training or experience to appreciate these aspects of the sound, but at the same time there is just so much more to hear in the recording. You become more engaged with it; involvement rather than passively observing a superficially pleasant sound wafting over you.

Thinking about it some more, it is obvious that what I am describing is the sound of the recording, not the system. Some would describe this as a “neutral” system, but the mistake would then be to say that the resulting sound is neutral; I think recordings are astonishing if only we get to hear them without an intervening interpretation.