Nowadays we are used to the concept of the lossy codec, which can reduce the bit rate of CD-quality audio by a factor of, say, 5 without much audible degradation. We are also accustomed to lossless compression, which can roughly halve the bit rate without any degradation at all.
But many people may not realise that they were listening to digital audio and a form of lossy compression in the 1970s and 80s!
Early BBC PCM
As described here, the BBC were experimenting with digital audio as early as the 1960s, and in the early 70s they wired up much of the UK FM transmitter network with PCM links in order to eliminate the hum, noise, distortion and frequency response errors that were inevitable with the previous analogue links.
So listeners were already hearing 13-bit audio at a sample rate of 32 kHz when they tuned into FM radio in the 1970s. I was completely unaware of this at the time, and it is ironic that many audiophiles still think that FM radio sounds good but wouldn’t touch digital audio with a bargepole.
13 bits was pretty high quality in terms of signal-to-noise ratio, and the 32 kHz sample rate gave something approaching 15 kHz audio bandwidth which, for many people’s hearing, would be more than adequate. The quality was, however, objectively inferior to that of the Compact Disc that came later.
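As a rough sanity check of those figures, the usual 6.02N + 1.76 dB rule of thumb for an ideal N-bit quantiser gives:

```python
# Back-of-envelope figures for a 13-bit, 32 kHz PCM link (rule-of-thumb
# numbers only; real links would measure somewhat worse)
bits = 13
ideal_snr_db = 6.02 * bits + 1.76   # full-scale sine vs quantisation noise
nyquist_khz = 32 / 2                # absolute ceiling on audio bandwidth
print(f"~{ideal_snr_db:.0f} dB SNR, Nyquist limit {nyquist_khz:.0f} kHz")
```

Leaving room for the anti-aliasing filter’s roll-off, that 16 kHz ceiling becomes the roughly 15 kHz practical bandwidth mentioned above.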
Requantising to 10 bits
In the later 70s, in order to multiplex more stations into a lower overall bandwidth, the BBC wanted to compress higher-quality 14-bit audio down to 10 bits.
As you may be aware, requantising to a lower bit depth raises the level of the background noise, both from the reduced resolution and from the dither noise that must be added. For 10 bits with dither, the best that could be achieved would be a signal-to-noise ratio of around 54 dB (I think I am right in saying), although the modern technique of noise-shaping the dither can reduce the audibility of the quantisation noise.
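As a concrete sketch of that process (pure Python, with invented names and signal levels; the BBC’s hardware of course worked quite differently), reducing 14-bit samples to 10 bits with TPDF dither looks like this:

```python
import math
import random

random.seed(0)

def requantise(samples, bits_in=14, bits_out=10, dither=True):
    """Reduce signed integer samples from bits_in to bits_out resolution,
    returning values still scaled to the bits_in grid for easy comparison."""
    step = 1 << (bits_in - bits_out)        # the new, coarser quantisation step
    out = []
    for s in samples:
        x = float(s)
        if dither:
            # TPDF dither: sum of two uniform noises, one step peak-to-peak
            x += random.uniform(-step / 2, step / 2) + random.uniform(-step / 2, step / 2)
        out.append(round(x / step) * step)  # snap to the coarser grid
    return out

# A 1 kHz tone at 32 kHz, 20 dB below 14-bit full scale (8191)
tone = [round(0.1 * 8191 * math.sin(2 * math.pi * 1000 * n / 32000))
        for n in range(32000)]
coarse = requantise(tone, 14, 10)
err_rms = math.sqrt(sum((c - t) ** 2 for c, t in zip(coarse, tone)) / len(tone))
print(f"added noise RMS: {err_rms:.1f} 14-bit LSBs")
```

The added noise comes out at roughly half a 10-bit step in RMS terms, and unlike undithered truncation it is uncorrelated with the signal, i.e. a steady hiss rather than distortion.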
This would not have been acceptable audible quality for the BBC.
Companding Noise Reduction
Companding (compression-expansion) is a noise reduction technique that was already used with analogue tape recorders, e.g. the dbx noise reduction system. Here, the signal’s dynamic range is squashed during recording, i.e. the quiet sections are boosted in level according to a specific ‘law’. On replay, the inverse ‘law’ is applied in order to restore the original dynamic range. In doing so, any noise added during recording is pushed down in level along with the formerly-quiet sections, reducing its audibility.
With such a system, the recorded signal itself carries the information needed to control the expander, so the compressor and expander must track each other accurately in terms of the relationships between gain, level and time. Different time constants may be used for ‘attack’ and ‘release’, and these are a compromise between rapid noise reduction and audible side effects such as ‘pumping’ and ‘breathing’. Because the noise floor itself is modulated in level, it can be more audible against some programme material than others. Frequency-selective pre- and de-emphasis can also help to tailor the audible quality of the result.
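As a toy illustration of the principle (not dbx’s actual circuit; the 2:1 law, envelope follower and time constants here are all invented for the sketch), a digital model of an analogue compander might look like this:

```python
import math

def envelope(signal, attack=0.9, release=0.999):
    """Peak follower with fast attack and slow release; the coefficients
    are illustrative, not any real system's time constants."""
    env, out = 1e-6, []
    for s in signal:
        coeff = attack if abs(s) > env else release
        env = coeff * env + (1 - coeff) * abs(s)
        out.append(max(env, 1e-6))
    return out

def compress_2to1(signal):
    # 2:1 law: output level tracks the square root of input level,
    # so quiet passages are boosted relative to loud ones
    return [s / math.sqrt(e) for s, e in zip(signal, envelope(signal))]

def expand_1to2(signal):
    # Inverse law, driven by the *received* signal's own envelope --
    # which is why compressor and expander must track each other
    return [s * e for s, e in zip(signal, envelope(signal))]

# A steady 1 kHz tone at 32 kHz, peak 0.25, survives the round trip
tone = [0.25 * math.sin(2 * math.pi * 1000 * n / 32000) for n in range(3200)]
restored = expand_1to2(compress_2to1(tone))
```

Because the expander must estimate the level from the received signal, any mismatch between the two envelope followers shows up directly as gain error; this is exactly the tracking problem, and the pumping and breathing, described above.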
The BBC investigated conventional analogue companding before they turned to the pure digital equivalent.
The BBC called their digital equivalent of analogue companding ‘NICAM’ (Near Instantaneously Companded Audio Multiplex). It is much, much simpler, and more precise and effective than the analogue version.
It is as simple as this:
- Sample the signal at full resolution (14 bits for the BBC);
- Divide the digitised stream into time-based chunks (1 ms was the duration they decided upon);
- For each chunk, find the maximum absolute level within it;
- For all samples in that chunk, apply a binary shift sufficient to bring them down within the target bit depth (e.g. 10 bits);
- Transmit the shifted samples, plus a single value indicating by how much they have been shifted;
- At the other end, restore the full range by shifting each chunk’s samples back in the opposite direction by that number of bits.
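The steps above can be sketched in a few lines of Python (a toy model with invented names, ignoring how the shift values would be framed for transmission):

```python
def nicam_compress(samples, block=32, bits_out=10):
    """Near-instantaneous companding, per the steps above. Each block
    (1 ms at 32 kHz = 32 samples) is shifted right just enough for its
    loudest sample to fit in bits_out; the shift count travels alongside."""
    max_keep = (1 << (bits_out - 1)) - 1          # largest magnitude that fits
    blocks = []
    for i in range(0, len(samples), block):
        chunk = samples[i:i + block]
        shift = 0
        while (max(abs(s) for s in chunk) >> shift) > max_keep:
            shift += 1                            # quiet chunks keep shift = 0
        blocks.append((shift, [s >> shift for s in chunk]))
    return blocks

def nicam_expand(blocks):
    """Receiver side: shift each block back up. The discarded low bits are
    lost, but the error sits underneath the loud signal that caused it."""
    out = []
    for shift, chunk in blocks:
        out.extend(s << shift for s in chunk)
    return out

# A loud chunk loses its bottom bits; a quiet chunk passes through unchanged
loud = list(range(-8000, -8000 + 32 * 50, 50))    # 14-bit-scale samples
quiet = list(range(0, 32 * 4, 4))                 # already within 10 bits
sent = nicam_compress(loud + quiet)
restored = nicam_expand(sent)
```

With 14-to-10-bit operation the worst-case shift is 4 bits, so the reconstruction error is at most 15 LSBs per sample, hidden beneath a signal that is at least 512 LSBs strong in that chunk.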
Using this system, all ‘quiet’ chunks, i.e. those already within the 10-bit range, are sent unchanged. Chunks containing values that exceed the 10-bit range lose their least significant bits, but this loss of resolution is masked by the louder signal level. Compared to modern lossy codecs, this method requires minimal DSP and could be performed without software, using dedicated circuits built from logic gates, shift registers and memory chips.
You may be surprised at how effective it is. I have written a program to demonstrate it, and in order to really emphasise how good it is, I have compressed the original signal into 8 bits, not the 10 that the BBC used.
In the following clip, a CD-quality recording has been converted as follows:
- 0-10s is the raw full-resolution data
- 10-20s is the sound of the signal requantised to 8 bits with dither – notice the noise!
- 20-40s is the signal compressed NICAM-style into 8 bits and restored at the other end.
I think it is much better than we might have expected…
(I wanted to start with high quality, so I got the music extract from here:
This is the web site of a label providing extracts of its own high-quality recordings in various formats for evaluation purposes. I hope they don’t mind me using one of their excellent recorded extracts as the source for my experiment.)