Audio Objects

Some audio pessimists are convinced that because a stereo recording and reproduction system can only sample a couple of infinitesimal points within the overall ‘sound field’, it is futile to imagine that the result can be anything but a pale imitation of the real thing.

Others are convinced that although the efforts of recording engineers mean that the recording itself is passable, the problem is that speakers playing in a real room are not conveying it to their ears accurately enough. They attempt to alter what comes out of the speakers in order to compensate for the room.

And stereo itself when reproduced over speakers is assumed to be so flawed due to crosstalk to the ‘wrong’ ear that it can’t possibly work, and we must be deluding ourselves if we think it does.

These are assumptions made by people who cannot allow themselves to enjoy their audio systems. I suggest they are fixated on the wrong things and the situation is much better than they imagine. A different way to view the problem of audio is this:

It is a mistake to think that the aim of the system is to recreate the precise waveform that would have reached the listener’s ear at the actual performance. It is not practically achievable, would not necessarily reproduce a realistic perception of the actual performance in the context of the listener’s own room anyway, and also it is not necessary. Most people couldn’t even tell you which of two plausible versions of the waveform is absolutely correct, and that is because they’re not hearing a waveform; they’re hearing musical and acoustic ‘objects’. It is the relationship between those objects that is paramount.

An ‘object’ could be:

  • A voice
  • A choir
  • Silence
  • A sad note
  • A happy chord
  • Song lyrics
  • A violin
  • A rhythm
  • An orchestra
  • A concert hall
  • Tension

The primary aim of a hi-fi system (as opposed to a kitchen radio, for example) is to maintain the integrity of single objects and the separation of different objects.

The secondary aim of the hi-fi system is to present the objects in a plausible way that allows for the normal behaviour of the listener; the sound basically appearing to emanate from in front of the listener, separable by distance and direction, without strange acoustic sensations if they turn to talk to their neighbour.

And that’s it. Everything flows from there.

  • Harmonic distortion (and the corresponding intermodulation distortion) smears objects together.
  • Bumps and dips in the frequency and phase response of a speaker smear the objects together and punch holes in the integrity of the objects.
  • Noise smears itself over all the objects, obscuring their separation.
  • Limited bass damages the integrity of certain objects and smears those objects together.
  • Timing errors smear objects together. Resonators in speakers (e.g. bass reflex) that take time to ‘get going’ and time ‘to stop’ damage and smear the objects together.
  • Stereo obviously aids in separating objects. Just a pair of speakers provides a continuous spread of individual, separate acoustic sources. And stereo over speakers isn’t flawed; the crosstalk to the ‘wrong ear’ is how it produces the image in the first place.
  • Realistic volume helps to elevate objects above the noise floor, with a more natural sound due to our hearing’s volume-dependent frequency sensitivity.
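The first bullet can be illustrated with a toy calculation. What follows is only a minimal sketch, not a model of any real amplifier or speaker: the nonlinearity `distort`, its coefficients `k2` and `k3`, and the two test frequencies are all assumptions chosen for illustration. It passes two clean tones through a slightly nonlinear stage and measures what appears at frequencies that were never in the original signal.

```python
import math


def tone_level(signal, freq_hz, sample_rate):
    """Amplitude of one frequency component, measured with a single-bin DFT."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def distort(x, k2=0.05, k3=0.02):
    """Hypothetical weak nonlinearity (assumed for illustration only)."""
    return x + k2 * x * x + k3 * x ** 3


SAMPLE_RATE = 8000
N = 8000          # exactly one second, so whole-number frequencies line up
F1, F2 = 440, 550  # two 'objects', each a pure tone of amplitude 0.5

two_tones = [
    0.5 * math.sin(2 * math.pi * F1 * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * F2 * i / SAMPLE_RATE)
    for i in range(N)
]
distorted = [distort(s) for s in two_tones]

checks = [
    ("f1 (original)", F1),
    ("f2 (original)", F2),
    ("2*f1 (harmonic)", 2 * F1),
    ("f2-f1 (intermodulation)", F2 - F1),
    ("f1+f2 (intermodulation)", F1 + F2),
]
for label, freq in checks:
    clean = tone_level(two_tones, freq, SAMPLE_RATE)
    dirty = tone_level(distorted, freq, SAMPLE_RATE)
    print(f"{label:26s} {freq:5d} Hz  clean={clean:.5f}  distorted={dirty:.5f}")
```

Under these assumptions the clean signal contains essentially nothing at the harmonic or at the sum and difference frequencies, while the distorted one does. The sum and difference components belong to neither tone on its own, which is one way of picturing the ‘smearing together’ of objects.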

So some objects make it out of a kitchen radio OK: a rhythm, a melody or the words of a song. But other objects may be severely damaged or smeared together. On a hi-fi system you might hear two separate guitars but on the radio they’re just a wash over the whole sound. On the hi-fi you hear a startling, deep bass note, but on the radio there’s nothing.

And the hi-fi system does things ‘without trying’ – which is why some people can’t believe it’s doing them. The stereo system with speakers automatically creates a two-way interaction between the listeners and the performance because both are subject to the listening room’s acoustics. This also solves the problem of how to cram a concert hall into the listener’s room as well as the more intimate performances. Is the aim for the musicians and venue to come to the listener or for the listener to go to the performance? The stereo system with speakers creates a hybrid: regard it as the listener’s room being transported to the venue and its end wall being opened up.

What more do we want?

As I sit here listening to some big symphonic music playing on my ‘KEF’ DSP-based active crossover stereo system, I am struck by the thought: how could it be any better?

I sometimes read columns where people wonder about the future of audio, as though continuous progress is natural and inevitable – and as though we are accustomed to such progress. But it does occur to me that there is no reason why we cannot have reached the point of practical perfection already.

I think the desire for exotic improvements over what we have now has to be seen within the context of most people having not yet heard a good stereo system. They imagine that if the system they heard was expensive, it must therefore represent the state of the art, but in audio I think they could well be wrong. Some time ago, the audio industry and enthusiasts may even have subconsciously sniffed that they were reaching a plateau and begun to stall or reverse progress just to make life more interesting for themselves.

At the science fiction level, people dream of systems that reproduce live events exactly, including the acoustics of the performance venue. Even if this were possible, would it be worth it without the corresponding visuals? (and smells, temperature, humidity, etc.?)

Something like it could probably be achieved using the techniques of the computer games industry: synthesis of the acoustics from first principles, headphones with head tracking, or maybe even some system of printed transducer array wall coverings that could create the necessary sound fields in mid-air if there was no furniture in the room (and knowing the audio industry, it would also supplement the system with some conventional subwoofers). My prediction is that you would try it a couple of times, find it a rather contrived, unnatural experience, and next time revert to your stereo system with two speakers.

On a more practical level, the increasing use of conventional DSP is predicted. We are now seeing the introduction of systems that aim to reduce the (supposedly) unwanted stereo crosstalk that occurs from stereo speakers. The idea is to send out a slightly attenuated antiphase impulse from one speaker for every impulse from the other speaker, that will cancel out the crosstalk at the ‘wrong ear’. It then needs to send out an anti-antiphase impulse from the other speaker to cancel out that impulse as it reaches the other ear, and so on. My gut instinct is that this will only work perfectly at one precise location, and at all other locations there will be ‘residue’ possibly worse than the crosstalk. In fact we don’t seem bothered by the crosstalk from ordinary stereo – I am not convinced we hear it as “colouration”. Maybe it results in a narrowing of the width of the ‘scene’, but with the benefit of increasing its stability. (Hand-waving justification of the status quo, maybe, but I have tried ambiophonic demonstrations, and I was eventually happy to go back to ordinary stereo).
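The chain of impulses and anti-impulses described above can be sketched in a few lines. This is a deliberately crude toy model, and everything in it is an assumption: each speaker reaches its near ear with gain 1 and no delay, reaches the far ear with gain `GAIN` after `DELAY` samples, and ‘moving the listener’ is represented simply by the real crosstalk delay being one sample longer than the canceller assumed. It is not how any commercial system is implemented.

```python
def cancellation_feeds(gain, delay, terms, length):
    """Speaker feeds intended to put a single impulse at the left ear only.

    Term k is (-gain)**k at time k*delay; even terms go to the left speaker,
    odd terms (the cancelling impulses) go to the right speaker.
    """
    left = [0.0] * length
    right = [0.0] * length
    for k in range(terms):
        t = k * delay
        if t >= length:
            break
        amp = (-gain) ** k
        if k % 2 == 0:
            left[t] += amp
        else:
            right[t] += amp
    return left, right


def at_ears(left, right, gain, delay):
    """What each ear receives: near speaker directly, far speaker as crosstalk."""
    def delayed(signal, t):
        return signal[t - delay] if t >= delay else 0.0

    n = len(left)
    left_ear = [left[t] + gain * delayed(right, t) for t in range(n)]
    right_ear = [right[t] + gain * delayed(left, t) for t in range(n)]
    return left_ear, right_ear


def energy(signal):
    return sum(v * v for v in signal)


GAIN, DELAY, TERMS, LENGTH = 0.7, 3, 20, 64  # illustrative values only
left_feed, right_feed = cancellation_feeds(GAIN, DELAY, TERMS, LENGTH)

# Listener exactly where the canceller assumes them to be.
left_ear_designed, wrong_ear_designed = at_ears(left_feed, right_feed, GAIN, DELAY)
# Listener displaced: actual crosstalk delay differs from the assumed one.
_, wrong_ear_moved = at_ears(left_feed, right_feed, GAIN, DELAY + 1)

plain_crosstalk = GAIN ** 2  # 'wrong ear' energy for ordinary stereo, one impulse

print("wrong-ear energy, ordinary stereo:   ", plain_crosstalk)
print("wrong-ear energy, designed position: ", energy(wrong_ear_designed))
print("wrong-ear energy, displaced listener:", energy(wrong_ear_moved))
```

In this idealised impulse model the ‘wrong ear’ receives essentially nothing at the designed position, but with the mismatched delay none of the impulses line up and the residue carries more energy than the crosstalk it was meant to remove. With real, band-limited signals the mismatch would depend on frequency, so this shows only the direction of the effect, not its size.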

Other predictions include the increasing use of automatic room correction, ultra-sophisticated tone controls and loudness profiles that allow the user to tailor every recording to their own preferences.

Tiny speakers will generate huge SPLs flat down to 20 Hz – the Devialet Phantom is the first example of this, along with the not-so-futuristic drawback of needing to burn huge amounts of energy to do it. Complete multi-channel surround envelopment will come from hidden speakers.

At the hardware fetish end, no doubt some people imagine that even higher resolution sample rates and bit depths must result in better audible quality. Some people probably think that miniaturised valves will transform the listening experience. High resolution vinyl is on the horizon. Who knows what metallurgical miracles await in the science of audio interconnects?

For the IT-oriented audiophile, what is left to do? Multi-room audio, streaming from the cloud, complete control from handheld devices are all here, to a level of sophistication and ease of use limited only by the ‘cognitive gap’ between computer people and normal human users that sometimes results in clunky user interfaces. The technology is not a limiting factor. Do you want the album artwork to dissolve as one track fades out and the new artwork to spiral in and a CGI gatefold sleeve to open as the new track fades in? The ability to talk to your device and search on artist, genre, label, composer, producer, key signature? Swipe with hand gestures like Minority Report? Trivial. There really is no limit to this sort of thing already.

In fact, for the real music lover, I don’t think there is anything left to do. Truth be told, we were most of the way there in 1968.

The basic test is: how much better do you want the experience of summoning invisible musicians to your living room to be? I can’t imagine many worthwhile improvements over what we have now. The sound achievable from a current neutral stereo system is already at ‘hologram’ level; the solidity of the phantom image is total – the speakers disappear. It isn’t a literal hologram that reproduces the acoustics in absolute terms, allowing you to walk around it, of course, but it is a plausible ‘hologram’ from any static listening position, allowing you to ‘walk around it’ in your mind, and it stays plausible as you turn your head.

It isn’t complete surround envelopment, but there is reverberation from your own room all around you, and it seems natural to sit down and face the music. You will hear fully-formed, discrete, musical parts emerging from an open, three dimensional space, with acoustics that may not be related to the space you are listening in. You have been transported to a different venue – if that is what the recording contains. In terms of volume and dynamics, a modern system can give you the same visceral dynamics as the real performance.

And all this is happening in your living room, but without any visuals of the performance – it is music that you are wanting to listen to after all. If the requirement is to experience a literal night at the opera, then short of a synthesised Star Trek type ‘holodeck’ experience you will be out of luck.

You could always watch a high resolution DVD of some performance or the BBC’s Proms programmes, for example, and such visuals may give you a different experience. They will, however, destroy the pure recreation of the acoustic space in front of you because, by necessity, the visuals jump around from location to location, scene to scene in order to maintain the interest level, and your attention will be split between the sound and the imagery. Anyway, a huge TV will cost you about £200 from Tescos these days so that aspect is pretty well covered, too.

The natural partner to a huge TV is multi-channel surround sound. Quadraphonic sound seemed like the next big thing in the 1970s, but didn’t take off at the time. We now have five or seven channel surround sound. Does this improve the musical experience? Some people say so, but that could just be the gimmick factor, or an inferior stereo system being jazzed up a bit. While the correlation between two good speakers produces an unambiguous ‘solution’ to the equations thereof, multiple sources referring to the same ‘impulse’ could result in no clear ‘solution’ – that is, a fuzzy and indistinct ‘hologram’ that our ears struggle to make sense of. Mr. Linkwitz surmises something similar in the case of the centre speaker, plus he finds it visually distracting; with just two speakers, the space between them becomes a virtual blank space in which it is easier to imagine the audio scene. Most recordings are stereo and are likely to remain that way with a large proportion of listeners using headphones. For these reasons, I am happy that stereo is the best way to carry on listening to music.

Can DSP improve the listening experience further? Hardly at all I would say. So-called ‘room correction’ cannot transform a terrible room into a great one, and it doesn’t even transform a so-so one into a slightly better one. It starts from a faulty assumption: that human hearing is just a frequency response analyser for which real acoustics (the room) are an error, rather than human hearing having a powerful acoustics interpreter at the front end. If you attempt to ‘fix’ the acoustics by changing the source you just end up with a strange-sounding source. At a pinch, the listener could listen in the near(er) field to get rid of the room, anyway.

I am convinced that the audiophile obsession with tailoring recordings to the listener’s exact requirements is a red herring: the listener doesn’t want total predictability, and a top notch system shouldn’t be messed about with. As a reviewer of the Kii Three said:

…the traditional kind of subjective analysis we speaker reviewers default to — describing the tonal balance and making a judgement about the competence of a monitor’s basic frequency response — is somehow rendered a little pointless with the Kii Three. It sounds so transparent and creates such fundamentally believable audio that thoughts of ‘dull’ or ‘bright’ seem somehow superfluous.

The user doesn’t have access to the individual elements of the recording. What can be done in terms of, say, reducing the volume of the hi-hats (or whatever) is crude and unnatural and bleeds over every other element of the recording. The only chance of reproducing a natural sound, maintaining the separation between fully-formed elements and reproducing a three dimensional ‘scene’, is for the system to be neutral. When this happens, the level of the hi-hats likely just becomes part of the performance. Audiophiles who, without any caveat, say they want DSP tone controls in order to fiddle about with recordings have already given up on that natural sound.

In summary, I see the way music was ‘consumed’ 40 or even 50 years ago as already pretty much at the pinnacle: two large speakers at one side or end of a comfortably-furnished living room, filling the space with beautiful sound – at once combining compatibility with domestic living and the ability to summon musicians to perform in the space in a comprehensible form that one or several people can enjoy without having to don special apparatus or sit in a super-critical location. And the fitted carpets of those times were great for the acoustics!

All that has happened in the meantime is just the ‘mopping up’ of the remaining niggles. We (can) now have better performance with respect to distortion, frequency response, dynamic range, and a more solid, holographic audio ‘scene’; no scratches and pops; instant selection of our choice of the world’s total music library. The incentives for the music lover to want anything more than this are surely extremely limited.

The First CD Player

There’s an amazing online archive of vintage magazines that I have only just begun rummaging through. I was pleased to see this 1982 review of the Sony CDP-101, the first commercial CD player. The reviewer gets hold of a unit even before they go on sale commercially, saying:

I feel as though I am a witness to the birth of a new audio era.

This was the first time that the public had encountered disc loading drawers, instant track selection, digital readouts and digital fast forward and rewind, so he goes into great detail on how these work.

And at that time, the mechanics of the disc playing mechanism seemed inextricably linked with the nature of digital audio itself, so, after reading the more technical sections of the article, the reader’s mind would be awhirl with microscopic dots, collimators and laser focusing servos – possibly not really grasping the fundamentals of what is going on.

Audio measurements are shown, though, and of course these are at levels of performance hitherto unknown. (He is not able to make his own measurements this time, but a month later he has received the necessary test disc and is able to do so).

As I write these numbers, I find it difficult to remember that I am talking about a disc player!

Towards the end, the reviewer finally listens to some music. He is impressed:

I was fortunate enough to get my hands on seven different compact digital disc albums. Some of the selections on these albums were obviously dubbed from analog master tapes, but even these were so free of any kind of background noise that they could, for the first time, be thoroughly enjoyed as music. There’s a cut of the beginning of Also Sprach Zarathustra by Richard Strauss, with the Boston Symphony conducted by Ozawa, that delivers the gut-massaging opening bass note with a depth and clarity that I never thought possible for any music reproduction system. But never mind the specific notes or passages. Listening to the complete soundtrack recording of “Chariots of Fire,” the images and scenes of that marvelous film were re-created in my mind with an intensity that would just not have been possible if the music had been heard behind a veil of surface noise and compressed dynamic range.

He talks about

…the sheer magnificence of the sound delivered by Compact Discs

and concludes:

…after my experiences with this first digital audio disc player and the few sample discs that were loaned to me, I am convinced that, sooner or later, the analog LP will have to go the way of the 78 shellac record. I can’t tell you how long the transition will take, but it will happen!

A couple of months later he reviews a Technics player:

Voices and orchestral sounds were so utterly clean and lifelike that every once in a while we just had to pause, look up, and confirm that this heavenly music was, indeed, pouring forth from a pair of loudspeaker systems. As many times as I’ve heard this noise-free, wide dynamic-range sound, it’s still thrilling to hear new music reproduced this way…

…the cleanest, most inspiring sound you have ever heard in your home

So here we are at the very start of the CD era, and an experienced reviewer finds absolutely no problems with the measurements or sound.

In audiophile folklore, however, we are now led to believe that he was deluded. It is very common for audiophiles to sneer about the advertising slogan “Perfect Sound Forever”.

Stereophile in 1995:

When some unknown copywriter coined that immortal phrase to promote the worldwide launch of Compact Disc in late 1982, little did he or she foresee how quickly it would become a term of ridicule.

But in an earlier article from 1983 they had reviewed the Sony player saying that with one particular recording it gave:

…the most realistic reproduction of an orchestra I have heard in my home in 20-odd years of audio listening!

…on the basis of that Decca disc alone, I am now fairly confident about giving the Sony player a clean bill of health, and declaring it the best thing that has happened to music in the home since The Coming of Stereo.

For sure, there were/are many bad CDs and recordings, but it is now commonly held that early CD was fundamentally bad. I don’t believe it was. I would bet that virtually no one could tell the difference between an early CD player and modern ‘high res’.

Both magazines seemed aware that their own livings could be in jeopardy if ‘all CD players sound the same’, but I think that CD’s main problem was the impossibility of divorcing the perceived sound from the physical form of the players. 1980s audio equipment looked absolutely terrible – as a browse through the magazines of the time will attest.

Within a couple of years, CD players turned from being expensive, heavy and solid, to cheap, flimsy and with the cheesiest appearance of any audio equipment. They all measured pretty much the same, however, regardless of cost or appearance. Digital audio was revealed to be what it is: information technology that is affordable by everyone.

This, of course, killed it in the eyes and ears of many audiophiles.

A new listening room

Here are my KEF Concords in their new home. Yes, a room whose walls are 1/3 glass! Since that photo was taken, floor-to-ceiling curtains have been installed:

The room is about 6m x 3.5m and has a ceiling height of 2.4m. Apart from the glass, the walls and ceiling are plasterboard, and the concrete floor is carpeted wall-to-wall. There’s a bed and various bits of junk in the room.

To some people it may look like an acoustic nightmare, but it’s actually sounding good. I’ve got the speakers wider apart than shown in the photo. I did originally set the bass -3dB point at 38 Hz, but I think that was too low and it is now at 44 Hz. Apart from that, I haven’t made any provision for ‘room correction’ as such. I am using 5th order crossover filters and the depth of the baffle step compensation curves has been set by ear.

I am pleased to find that I am achieving the desirable effect of the end of the room appearing as a clear window (literally and metaphorically!) onto the performance, particularly ‘purist’ classical recordings. There’s a nice level of clean bass and great imaging and detail higher up. It seems to work just fine with the curtains open or closed – when open the curtains are bunched up in the corners. Maybe looking out through the window does enhance the perception of front-to-back depth of the recording.

Audiophile Demo Music

In shops that sell televisions, they often play some sort of ‘showreel’ of spectacular scenes; you know the type of thing: ultra-detailed night time cityscapes, ultra-saturated lizards, ultra-contrasty arctic wildlife, and so on. You realise that it is impossible to see any real difference between the televisions with these scenes. They are ‘impressive’, but only at the most superficial level of what a television can display. Basically, any modern television can display them, with the only differentiators being size and absolute brightness. It always seems to me that the only way I can tell if a TV is any good is to watch a local news programme or something like that – not zero ‘production values’, but something that is relatable to everyday life.

Does something similar happen with audio?

When writing this post, I vowed to myself to search for a report of an audio show demo track, and to use the first track I came across as my example – of course I would have quietly forgotten that vow if it hadn’t illustrated my point fairly well, but as it happens, I think it does. The track is by Malia, and is called I Feel It Like You.

Absolutely no criticism is implied of the track, nor its production which is exemplary for this kind of music. But as an audio demo track?

Listening to it on my laptop, it seems to me to be an ‘in-your-face’ studio recording, built from a fairly sparse assemblage of pristine layers, each of which has been processed, compressed and equalised. The vocals are crystal clear and close up, mixed with a carefully-balanced amount of ‘Large Hall’ reverberation. The backing features plenty of detail, with lots of staccato, sampled(?) percussion rhythms and bass.

I think that this track would sound superficially impressive on any system – it even sounds good on my laptop.

What it is missing, if you ask me, is any connection with the organic, natural acoustics we encounter every day. It is like those uber-detailed images used for TV demos; the sound is highly-detailed and everything is at peak contrast and saturation. Such tracks are very common in audio demonstrations.

An alternative staple of audiophile demonstrations is ‘jazz’… I’m not sure what the appeal of this is (as a demonstration). I suspect it is because it often seems like an antidote to over-production – although jazz can still be over-produced. But again, as the potential customer, I don’t think it is telling me very much about the system’s capabilities. Old recordings of jazz are like grainy monochrome pictures, and modern recordings are still showing a ‘scene’ that is ‘smokey’ and sepia-toned (which I am sure is the intention). The style of music and the instrumental line-up (e.g. continuous brushed snare..?) means that I am often not hearing clear delineation between the instruments nor much in the way of transients and dynamics. (Or maybe I just don’t like jazz particularly and cannot engage with it, in which case ignore my objections…).

Just looking through some of the tracks that I might ‘demo’ my system with, one thing strikes me: they usually feature a bit of ‘messiness’. They may, or may not, have been put together in a studio using overdubbing, but the individual layers are a bit raw, organic, and recorded from a bit of a distance, so the room’s natural acoustics are audible. This possibly masks a bit of the pristine detail, but there’s enough there to verify that the system can do detail, anyway. When a short sound stops, and the reverberation remains, the contrast between the two can be particularly revealing. In photographic terms, the image covers all shades of grey and there’s still detail in the shadows; it’s not pushed into excessive contrast, nor selected or processed to be super-detailed. I am not even advocating massive ‘dynamics’ most of the time, which some people cite as proof of a system’s chops. As I will mention later, there are some specific classical tracks that might be played in order to put the system’s dynamic capabilities beyond doubt, anyway!

My favoured demo tracks are not just a single mic recording of a school concert, of course! They have been put together with some high ‘production values’.

It is worth perhaps listing the aspects of the system we might want to show off, or listen to if we are thinking of buying it.

  • frequency response: it is good if the track covers a wide spectrum of frequencies with equal weighting – not just bass and treble. A problem with many a system would be fixed bumps and dips in the frequency response. These are almost impossible to hear against a recording that also has fixed bumps and dips in its ‘frequency response’: a solo voice, a string quartet or a piano, for example. All of these are generated by resonant systems characterised by a formant, or a group of similar formants. Some studio recordings are also augmented with fairly aggressive parametric equalisation of the individual layers in order to make them sound even more detailed. It is only when we hear many different natural musical sources playing in varying combinations that we assemble enough ‘simultaneous equations’ to work out whether the system is neutral or not.
  • bass: of course we want to demonstrate this! Deep organ notes, kick drums, symphony orchestras in natural acoustics are going to show this off well. The best bass does not have a ‘one note’ quality; it engages somewhat with ‘room gain’ in order to extend all the way down to below audibility; it starts and stops quickly, hitting you in the chest (the kick drum will show this). In other words, sealed not bass reflex…
  • distortion: a sine wave would show up harmonic distortion, and several musical sources all playing at the same time would show up the resulting intermodulation distortion. A single voice will not really show it, nor percussive sounds. A choir would probably be a pretty good demonstration of low distortion, as would a symphony orchestra playing a varied selection. Less good would be girl-and-guitar, a string quartet, or a ‘world music’ drumming ensemble.
  • imaging: the really great demo, in my opinion, is when the stereo speakers produce a complete 3D audio ‘scene’. It may be an “illusion” as some people are very keen to point out, and not a perfect holographic reproduction especially if the recording was created with multiple mics and overdubs in a studio, but it is very compelling. Some classical recordings are made in purist fashion and do create a very convincing sense of 3D space – not just left-right imaging, but also a sense of distance. Imaging depends at least on low distortion and accurate correlation between left and right speakers, implying (I would say) a requirement for accurate reproduction of phase and timing. Some people would claim that absolute reproduction of phase isn’t important as long as both channels are well matched. I think this is special pleading based on the performance of traditional systems; I sometimes think that the people who are very keen to ‘dis’ imaging probably have very expensive systems based on valves, vinyl and passive crossovers…
  • power: achieving high volume isn’t usually a problem, but we want the system to behave uniformly well at all volumes. I suggest that the way this would be made obvious would be when a musical performer or ensemble plays continuously and naturally between quiet and loud – with minimal dynamic compression being applied. This is different from demonstrating a system playing a less dynamic recording with the volume control low and then high. As the Fletcher-Munson curves show, there is only one volume at which we perceive a sound with the correct frequency response: its natural volume. If the system does something peculiar as the volume increases, it will be much more obvious if we are listening at a fixed volume that is closer to the ‘real’ volume at which it was recorded.

Of course, recommending tracks is a bit pointless, because the track’s ‘demo’ qualities are combined with musical taste – and I think you need to like the music in order to engage fully with what you are hearing and to know how it’s going to sound with ‘your’ music. Nevertheless, here are a few tracks out of hundreds that I tentatively suggest would reveal a system’s attributes (no accounting for YouTube’s sound quality) and are the sort of thing I would want to listen to in order to get some idea of whether a system was any good.

Sufjan Stevens, Jacksonville – not a familiar act to me, but this track is ‘big’, has great bass and enough rawness to hear that the system sounds ‘natural’.

Elton John, Rocket Man – a beautiful, rounded studio recording with a great sense of space (so to speak).

Neil Young, Double E – very simple rock track that doesn’t sound over-produced.

Khachaturian Symphony Number 3 – a *massive* symphonic recording with huge pipe organ and 15 trumpets (apparently). If you play this loud, the end is very loud!

Arvo Pärt, Credo, for Piano Solo, Mixed Choir and Orchestra – possibly some of the most dynamic, contrasting classical music you will encounter.

(Maybe these classical performances are a bit too dynamic for everyday listening, but if you really want the demo to show what the system is capable of..!)

A less intense classical recording with some great imaging, space and some revealing bass is this one:

It’s An American in Paris by Gershwin, performed by the LA Philharmonic under Zubin Mehta – not sure if the YouTube version is the same as the CD version I listen to.

The Sound of a Symphony Orchestra

Last night I went to a symphony concert: Shostakovich’s 10th, preceded by Prokofiev’s Piano Concerto No. 2 at the West Road Concert Hall, Cambridge.

We were sitting in the second row from the front – so quite close to the piano. I wish I had taken a photograph, but I was so paranoid about my phone ringing mid-performance that I left it turned off! The image above shows the empty venue.

We really enjoyed the concert. Chiyan Wong is an amazing piano soloist, and CCSO were spectacular. The sound was formidable from a large orchestra, and we got to hear the fairly new Steinway grand in great detail – the piano was removed during the interval, for the Shostakovich that followed.

Now, I do often listen to this sort of music with my system, but this was the first time I had been to a concert to hear this specific Russian ‘genre’. Of course I couldn’t help but make a mental comparison of the sound of the real thing versus the hi-fi facsimile that I am used to, as I was listening. And you know what? I have to say that a good hi-fi gives a pretty good rendition of the real sound.

The real thing was very loud, but also very rich – I have observed that ‘painfully loud’ is more a function of quality than volume; you need good bass to balance the rest of the spectrum. So this was very loud, but at no time painful. Bass from the orchestra was wonderful, but didn’t take me by surprise – I sometimes hear such bass from my system. (It did take me by surprise the first time I heard it from a hi-fi system, however!)

Some people cite piano as being the most difficult thing for a hi-fi system to reproduce. I don’t know where they get that from: I loved the sound of the piano, and I think a good system can reproduce it fairly easily.

I was struck by the homogeneity within the different sections of the orchestra. Listening to a recording of just a piano, or just the violins, would not tell you very much about an audio system. It is only when you hear a combination of the piano, the violins and the brass, say, that any ‘formant’ (i.e. fixed frequency response signature) within your system would show up.

As discussed previously, ‘imaging’ of the orchestra was not as pin sharp as you get in some recordings, but many purist recordings portray the true effect quite accurately. The width of the ‘soundstage’ of a stereo system is more-or-less right, and the room you are listening in enhances the recording’s ‘ambience’ around and behind you.

Of course the concert is a very special experience. The stereo version isn’t always as deep, open and spacious, nor is the envelopment as complete but, all in all, I think if you sit down in the right frame of mind to listen to a fine orchestral recording using a good hi-fi system, you are getting a very reasonable impression of the sound, excitement and visceral quality of the real thing. And that really is quite an amazing idea.

Should concert halls use “assisted resonance”?

I recently read a very interesting article on the sonic deficiencies of some major classical concert halls and the possibility of using “assisted resonance”, a.k.a. electronics and DSP, to improve their reverberation characteristics.

It seems that there have been some expensive acoustic disasters over the years, where new concert halls have failed to live up to expectations. London’s Royal Festival Hall, which opened in 1951, is one of them, and it does seem a shame that a very expensive building, designed and built purposely for music, has acoustics where performers apparently “lose the will to live”.

The science of concert hall acoustics has become better understood recently, but even if the hall works as predicted, there is no such thing as a one size fits all characteristic that is optimal for all types of speech and music. Even if there were, the number of people in the seats for a particular performance has a significant effect on the nature and length of reverberation. Starting from scratch, a good strategy might be to go all out for reverberation, with optimal dimensions and hard surfaces that could be covered with retractable curtains as required, but for many existing halls it is too late; they were built with the wrong materials and have the wrong dimensions, and it would be too costly to modify them.

And this is where electronics can supposedly come to the rescue. Microphones can be placed near to the stage and around the auditorium, and their output processed with DSP and fed to loudspeakers. Acoustically, the system can subjectively give the impression of changing the materials the hall is made of, or its dimensions. There are various commercial and experimental systems, and their use seems to be quite widespread.

Acoustic feedback from the speakers to the microphones is a factor that has to be managed, and is a limitation on the designers’ ability to create any response they desire (modern DSP techniques reduce the feedback problem, though possibly with audible side effects). It was also the actual basis of one of the earliest attempts at electronic reverberation, known as “assisted resonance”, which was used in the Royal Festival Hall in the 1960s.
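To see why feedback limits what the designers can do, here is a deliberately crude sketch. It assumes a single loop (microphone to DSP to speaker and back to the microphone) with one fixed gain, and ignores delay and frequency dependence entirely; the function name and the numbers are my own invention for illustration, not taken from any real system. Each trip round the loop adds the previous contribution multiplied by the loop gain.

```python
def regenerated_level(loop_gain, passes=200):
    """Total level after a sound has gone round the
    mic -> DSP -> speaker -> mic loop a number of times.

    Toy model only: one loop, one gain, no delay, no frequency response.
    """
    level = 0.0
    contribution = 1.0  # the original sound, before any regeneration
    for _ in range(passes):
        level += contribution
        contribution *= loop_gain  # each pass is scaled by the loop gain
    return level


# Below a loop gain of 1 the regenerated sound dies away and the total settles;
# above 1 every pass is louder than the last and the system 'howls'.
for gain in (0.5, 0.9, 1.1):
    print(gain, regenerated_level(gain))
```

With the loop gain below 1 the sound decays, which is in effect extra reverberation; pushing the gain towards 1 lengthens the decay but also brings the system closer to howling, which is the trade-off the paragraph above describes.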

So reverberation enhancement ‘works’, but should it be used? Well, as a person who is pro the use of DSP in audio systems, I find myself unable to embrace it enthusiastically for classical concert halls, and I would much prefer to remain in ignorance if it is being used in any hall I might go to! I wonder how the majority of audiophiles would feel about it? Personally, I think I know too much about the reality of electronics and the people who inhabit that world! I don’t attribute the characteristics of art, craftsmanship, music and musicians to audio equipment. The reality is that audio equipment is created by technicians who are not steeped in art and have not served a musical craftsman’s apprenticeship. Do they have any business in a classical concert hall?

Electronic reverberation enhancement would no doubt be a mixture of approaches: custom design by computer programmers and acousticians in offices, and then physical construction of the system using standard microphones, amplifiers, DSP units, speaker drive units and custom MDF enclosures crammed into whatever corners and spaces of the hall were convenient. Gaffa tape and the wearing of heavy metal T-shirts would be involved in the installation.

No one can say for sure what the ideal hall response should be, and even if they could it wouldn’t be achievable in every seat of the house. By definition we would be retro-fitting the system into an existing hall so would not have free rein to place speakers and microphones in all the optimal (if we even knew how to define optimal) locations. I have no doubt that, given a full 3D model of the auditorium, the acoustics with and without the electronic system could be simulated and plotted quite accurately, but this wouldn’t tell us the optimum settings for the system in order to maximise performance throughout the auditorium. If we felt able to specify criteria for “performance” then we could set a computer running with the task of finding the best compromise using simulated annealing or similar. We could go for best possible performance in the most expensive seats and not worry about everywhere else, or go for the best average performance throughout the auditorium, say. But notice what would have happened there. The future sound of classical music performances would have been set by:

  1. Arbitrary placement of transducers
  2. Sparse coverage of transducers
  3. Imperfect transducers
  4. An incomplete model of the auditorium and all possible configurations of stage, audience and placement of performers
  5. An incomplete simulation of the acoustics
  6. Arbitrary criteria for what makes ‘good’ acoustics
  7. Arbitrary criteria for distributing ‘good’ performance throughout the auditorium
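For the curious, the ‘set a computer running’ step mentioned above could look something like the sketch below. It is only a toy: the ‘hall’ is a single row of ten seats, the acoustic ‘simulation’ is a made-up formula, and every name and number in it is an assumption of mine rather than anything from a real system. What it does show is simulated annealing searching for speaker gains under two different, equally arbitrary, sets of seat weights.

```python
import math
import random

# All numbers here are made up for illustration; none describe a real hall.
TARGET_RT = 2.0                    # desired reverberation time in seconds
NATURAL_RT = 1.2                   # the hall's unassisted reverberation time
SPEAKER_POSITIONS = [0, 3, 6, 9]   # speakers along a single row of ten seats


def seat_rt(gains, seat):
    """Toy stand-in for an acoustic simulation: each speaker lengthens
    the reverberation at a seat by an amount that falls off with distance."""
    boost = sum(g / (1.0 + abs(seat - pos))
                for g, pos in zip(gains, SPEAKER_POSITIONS))
    return NATURAL_RT + boost


def cost(gains, weights):
    """Weighted squared error from the target, summed over all seats."""
    return sum(w * (seat_rt(gains, seat) - TARGET_RT) ** 2
               for seat, w in enumerate(weights))


def anneal(weights, steps=5000, seed=0):
    """Search for speaker gains in [0, 1] that minimise cost()."""
    rng = random.Random(seed)
    gains = [0.0] * len(SPEAKER_POSITIONS)
    current = cost(gains, weights)
    best, best_cost = list(gains), current
    for step in range(steps):
        temperature = 1.0 - step / steps + 1e-6
        candidate = list(gains)
        i = rng.randrange(len(candidate))
        candidate[i] = min(1.0, max(0.0, candidate[i] + rng.gauss(0.0, 0.05)))
        c = cost(candidate, weights)
        # Always accept an improvement; accept a worse setting with a
        # probability that shrinks as the temperature falls.
        if c < current or rng.random() < math.exp((current - c) / temperature):
            gains, current = candidate, c
            if c < best_cost:
                best, best_cost = list(candidate), c
    return best, best_cost


average_weights = [1.0] * 10               # every seat counts equally
premium_weights = [5.0] * 3 + [0.2] * 7    # favour the three 'expensive' seats

print(anneal(average_weights))
print(anneal(premium_weights))
```

The two weightings are the ‘best average’ and ‘best seats’ options from the paragraph above. The computer will happily return an answer for either; the choice between them is made by whoever writes the weights.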

You might say that something very similar would have occurred during the design of any modern acoustic-only concert hall: computer simulations and the setting of arbitrary criteria. But I would point out one crucial difference: a physical space and its acoustics form an entirely consistent system where the sound at any point is the sum of the direct sound and multiple delayed reflections. Even if the acoustics are not ideal they are consistent within themselves, and by moving throughout the space and sampling the response to impulses generated from known positions, multiple viable models could be constructed of the auditorium, which would gradually refine down to a single viable, consistent model. Electronically-generated acoustics cannot do this throughout the whole space. That is, they cannot be guaranteed to simulate a building that actually exists – certainly not at every position in the hall. Maybe the stationary human listener cannot hear the inconsistency, or maybe they can, but I don’t think it would be possible to guarantee a totally convincing effect at every point in the auditorium – unlike the case of an acoustic-only space no matter how flawed.

Other inconsistencies would include:

  • A ‘cognitive dissonance’ between the dimensions and materials of the hall and its sound (maybe it is obviously constructed from soft materials yet sounds like a stone church with different dimensions to the actual space)
  • A disconnect between the auditorium’s acoustic effect on sounds made by the audience itself (yes, coughing probably!) and its different apparent effect on the sounds made by the performers.

I realise that none of this may be the huge problem I am making it out to be. It’s just that I am wary of hype, and sceptical of the abilities of technicians! If a person who is adept at audio installation, mathematics or computer software tells me that they possess special powers enabling them to create the world’s finest concert hall acoustics with a few microphone capsules and polymer cones then I am not wholly convinced. Even if they are experts in their field (and this field could be very relevant, like synthesising acoustics from first principles within 3D computer games) it does not automatically mean they can really do it.

The way I envisage the installation, technicians with laptops would pore over colourful charts on their screens, talking about “waterfall plots” and setting the system up to their own best guesses based on the methods they often use in sports stadiums, pop venues and shopping malls. Driving home at night they would be playing the latest Rihanna album on their car stereo, not Harrison Birtwistle; I would expect meaningful communication with the concert hall people to be limited simply because of the gulf of understanding between them.

In use it would become apparent that there were rough edges to the sound, but the concert hall people would be incapable of describing it in a way that could be understood by the technicians. Despite repeated attempts the sound would never be great. In short, the hall would become the offspring of two cultures that do not understand each other.

Over time the system would degrade. The microphones in very awkward-to-get-to places would gather thick layers of dust, changing their response. Occasionally, mysterious sounds caused by a spider living in one of the mics would be heard but never solved. Cables would be damaged by roofing contractors and repaired using Blu-tack and sellotape. Sonic anomalies like a metallic ringing particularly audible from rows C to E in the balcony would never quite be fixed. At some point the suppliers of the system would lose the original configuration files, making modifications impossible. Occasionally the system would pick up mobile phone interference. Yes, this is how I imagine such a system would be.

Would the system have fixed settings, or require a man at a mixing desk out in the auditorium to make proper adjustments for each performance just like a pop venue?

And then there are the ‘philosophical’ implications. I think that when we go to a concert we engage in some ‘suspension of disbelief’. Of course deep down we know that the hall is purposely designed to sound good, and built for profit, and that the performers would rather be at home watching Game of Thrones that night. But we like to imagine that we have stumbled serendipitously upon a cultural happening with like-minded people in a magnificent hall built primarily as a gathering place, witnessing a group of performers doing what they do for the love of it. What happens there is spontaneous and not entirely predictable. Maybe the crowd will love the performance and the performers will feed off that reaction, or maybe they won’t. Maybe the organ will resonate in tune with the hall, or maybe because of atmospheric conditions and a particularly full house tonight it will be different and lend a new twist to the piece – without anyone analysing it of course.

Or at least that’s how I fondly imagine it. For me, electronic reverberation adds a new layer of ‘contrivance’. An analogy would be the use of electronics in a sports car to enhance the sound of the engine as heard by the driver (oh yes, they do that these days). There’s something not quite right about the direct, calculated, artificial ‘enhancement’ of something that is meant to be a fortunate by-product of something else. Even worse if it is created using technology in another ‘domain’ so that it is impossible to rationalise it as a “power valve” or whatever. Besides, it can never be perfect enhancement, for practical reasons such as that it is impossible for the car’s stereo speakers to create the low frequency vibrations that should accompany the harmonics we’re hearing. At some level, consciously or not, we may detect that the sound and physical sensations are not consistent with each other and decide that the whole thing is ‘fake’.

And then, if the performance is recorded there’s the idea that the recording I am listening to is a combination of real acoustics and some technician’s idea of good acoustics reproduced from imperfect speakers and then re-recorded by the mic! For better quality should the reverb instead be injected directly into the recording as a separate track? And maybe, just as we now recoil in horror at dated effects that were once thought to be timeless classics (e.g. gated reverb in the 1980s), will we only understand the true reality of having allowed young technicians to play with the hall’s acoustics when we listen to it again, decades later? Not a nice thought. But at least when the recording is re-mastered in years to come we can replace the reverb track, and its now-discredited algorithm, with the latest Steinberg Carnegie Hall impulse response – even though the performance was recorded in the Albert Hall.

As you can tell, I’m not keen.


UPDATE 20/06/15

Saw a link to a New Yorker article about someone’s experience with the Meyer Sound Constellation system. Reading it, I began to feel embarrassed: maybe my doubts about such systems are ill-founded, and in fact they are pretty much perfect. Maybe there really is a technical wizard who understands precisely how to solve this problem.

He clapped his hands; the sound resonated handsomely. Then he signalled for the power to be turned off. Suddenly, the clap was clipped and lifeless. The crowd gasped and applauded…

…it demonstrated the Meyers’ ability to conjure a plausible performance space. I was particularly struck by the sound of the tenor Nicholas Phan, in the Britten; the singer’s tensile strength and tonal colors came through intact. “It feels like a completely natural and real acoustic,” Phan told me afterward. “It even changes and feels different depending upon how full the audience is.”

But at the end, the author confirms what I might have expected:

All the same, I was never entirely convinced by the string timbre, especially the cellos and the double-basses. At full force, they had a slightly puffy, plastic quality—a familiar handicap of amplification that Meyer technicians haven’t yet overcome.

There is something philosophically disquieting about the Meyers’ work, as there is in any digital makeover of reality. Both at Oliveto and at SoundBox, the Constellation process never seemed obviously fake or too good to be true, and yet I had a sense of being ensconced in an audio cocoon. In the concert setting, I missed the thrum of floorboards under my feet—the full physical tingle of reverberation. Traditionalists will insist that there is no substitute for a first-class hall, and they will be right.

The Joy of Mozart?

Another night, another programme on BBC4. In The Joy of Mozart, presenter Tom Service revealed himself to be very, very keen on Mozart, having been transfixed upon first hearing a piece, aged 7. We visited various houses and flats that Mozart lived in, and even sat at some of the keyboards he played. We saw the tacky Mozart tourist industry in Salzburg. We talked with, and enjoyed the playing and singing of, famous musicians who simply love Mozart. Paul Morley is a convert too.

But sorry, I just don’t get it. There’s nothing that I like about it. I don’t hear anything in it so astounding that I forget to breathe (as one violinist finds). Maybe when it was new this music daringly broke rules and audiences rioted and started slashing the seats, but I have heard ‘Won’t Get Fooled Again’ and ‘I am the Walrus’, and the music of Vaughan Williams, Benjamin Britten, Stravinsky etc. etc. To me, Mozart just sounds prissy, mannered and lightweight. Am I alone in this?