The Logic of Listening Tests

Casual readers may not believe this, but in the world of audiophilia there are people who enjoy organising scientific listening tests – or more aptly ‘trials’. These involve assembling panels of human ‘subjects’ to listen to snippets of music played through different setups in double blind tests, pressing buttons or filling in forms to indicate audible differences and preferences. The motivation is often to use science to debunk the ideas of a rival group, who may be known as ‘subjectivists’ or ‘objectivists’, or to confirm the ideas of one’s own group.

There are many inherent reasons why such listening tests may not be valid, e.g.

  • no one can demonstrate that the knowledge that you are taking part in an experiment doesn’t impede your ability to hear differences
  • a participant who has his own agenda may choose to ‘lie’, pretending he is not hearing differences when, in fact, he is
  • etc. etc.

The tests are difficult and tedious for the participants, and no one who holds the opposing viewpoint will be convinced by the results. At a logical level, they are dubious. So why bother to do the tests? I think it is an ‘appeal to a higher authority’ to arbitrate an argument that cannot be solved any other way. ‘Science’ is that higher authority.

But let’s look at just the logic.

We are told that there are two basic types of listening test:

  1. Determining or identifying audible difference
  2. Determining ‘preference’

Presumably the idea is that (1) suggests whether two or more devices or processes are equivalent, or whether their insertion into the audio chain is audibly transparent. If a difference is identified, then (2) can make the information useful and tell us which permutation sounds best to a human. Perhaps there is a notion that in the best case scenario a £100 DAC is found to sound identical to a £100,000 DAC, or that if they do sound different, the £100 DAC is preferred by listeners. Or vice versa.

But would anything actually have been gained by a listening test over simple measurements? A DAC has a very specific, well-defined job to do – we are not talking about observing the natural world and trying to work out what is going on. With today’s technology, it is trivial to make a DAC that is accurate to very close objective tolerances for £100 – it is not necessary to listen to it to know whether it works.

For two DACs to actually sound different, they must be measurably quite far apart. At least one of them is not even close to being a DAC: it is, in fact, an effects box of some kind. And such are the fundamental uncertainties in all experiments that involve asking humans how they feel that it is entirely possible, in a preference-based listening test, for the listeners to prefer the sound of the effects box.

Or not. It depends on myriad unstable factors. An effects box that adds some harmonic distortion may make certain recordings sound ‘louder’ or ‘more exciting’ thus eliciting a preference for it today – with those specific recordings. But the experiment cannot show that the listeners wouldn’t be bored with the effect three hours, days or months down the line. Or that they wouldn’t hate it if it happened to be raining. Or if the walls were painted yellow, not blue. You get the idea: it is nothing but aesthetic judgement, the classic condition where science becomes pseudoscience no matter how ‘scientific’ the methodology.

The results may be fed into statistical formulae and the handle cranked, allowing the experimenter to declare “statistical significance”, but this is just the usual misunderstanding of statistics, which are only valid under very specific mathematical conditions. If your experiment is built on invalid assumptions, the statistics mean nothing.
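For reference, the handle being cranked usually looks something like the sketch below. It is a minimal illustration, not any particular experimenter's method: the function name `abx_p_value` and the 12-out-of-16 score are made up for the example. Note what the formula takes for granted: that every trial is an independent coin flip with a fixed 50% chance of a correct answer if the listener hears nothing. Those are exactly the kind of assumptions that a real panel of tired, bored or motivated humans may not satisfy.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing.

    Assumes independent trials, each with probability 0.5 of being right.
    """
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical result: a listener gets 12 of 16 trials right.
p = abx_p_value(12, 16)
print(f"p = {p:.4f}")  # p = 0.0384, below the conventional 0.05 cut-off
```

The number that comes out is only as meaningful as the coin-flip model that went in.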

If we think it is acceptable for a ‘DAC’ to impose its own “effects” on the sound, where do we stop? Home theatre amps often have buttons labelled ‘Super Stereo’ or ‘Concert Hall’. Before we go declaring that the £100,000 DAC’s ‘effect’ is worth the money, shouldn’t we also verify that our experiment doesn’t show that ‘Super Stereo’ is even better? Or that a £10 DAC off Amazon isn’t even better than that? This is the open-ended illogicality of preference-based listening tests.

If the device is supposed to be a “DAC”, it can do no more than meet the objective definition of a DAC to a tolerably close degree. How do we know what “tolerably close” is? Well, if we were to simulate the known, objective, measured error, and amplify it by a factor of a hundred, and still fail to be able to hear it at normal listening levels in a quiet room, I think we would have our answer. This is the one listening test that I think would be useful.
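As a rough sketch of how such a test signal might be generated, the code below uses plain 16-bit rounding as a stand-in for a DAC's measured error. That stand-in, the 997 Hz tone, the 44.1 kHz sample rate and all the names are assumptions made for illustration; a real test would substitute the error actually measured from the device.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (assumed)
BITS = 16             # word length used as a stand-in error source (assumed)
GAIN = 100.0          # exaggerate the error by a factor of a hundred (40 dB)


def quantise(x: float, bits: int) -> float:
    """Round x (nominally in the range -1..1) to the nearest quantiser step."""
    step = 2.0 / 2 ** bits
    return round(x / step) * step


def rms_db(samples: list) -> float:
    """RMS level in dB relative to an amplitude of 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


# One second of a 997 Hz tone at half of full scale: the 'ideal' output.
ideal = [0.5 * math.sin(2 * math.pi * 997.0 * n / SAMPLE_RATE)
         for n in range(SAMPLE_RATE)]

# The error is whatever the device adds to the ideal signal.
error = [quantise(x, BITS) - x for x in ideal]

# Exaggerate the error and add it back to the ideal signal for listening.
exaggerated = [GAIN * e for e in error]
test_signal = [x + e for x, e in zip(ideal, exaggerated)]

print(f"error level:       {rms_db(error):.1f} dB")
print(f"exaggerated error: {rms_db(exaggerated):.1f} dB")
```

With these assumptions the raw error sits at roughly -101 dB, and the exaggerated version is exactly 40 dB higher. Writing `test_signal` to a sound file and playing it at a normal listening level is left out of the sketch.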


2 thoughts on “The Logic of Listening Tests”

  1. Unfortunately, double blind listening tests are the only way to establish whether subjects can hear a difference or not between various pieces of equipment and music file types etc. Of course some people could lie by saying that they can hear no difference but most people are honest.

    I do not believe any subjective claim on hi-fi forums or in hi-fi magazines. Those claiming that they can hear the differences between different forms of cable, power supplies or between CD quality and so-called hi-res should prove that they can do so by subjecting themselves to peer-reviewed double blind tests. Otherwise they should either shut up or strongly qualify that their claims are only opinions and not facts.

    There is no doubt in my mind that equipment manufacturers etc. are making such tests but they probably never reveal the results. If some golden eared listeners were able to hear differences between music file types and resolution then there would be adverts emblazoned everywhere.


    1. Hi Trevor

      “…double blind listening tests are the only way to establish whether subjects can hear a difference or not between various pieces of equipment and music file types etc.”

      That has got to be true, but not necessarily directly..? I would say that in a case where we can measure or calculate a deviation from the equipment’s stated function (e.g. a DAC or a cable, or the whole system) and can show that the error lies way below the threshold of hearing (established using listening tests and other science in the past), we don’t need to do a fresh listening test.

      We could also simulate the error, exaggerate it, and listen to it in isolation, or add it to music and listen to it. Far more casual listening tests could then confirm that the error is a red herring.

      No system will ever be error free, and I think that we lose all sense of scale when talking about the errors due to digital audio resolution, jitter etc. In order to understand these phenomena we have to draw pictures of them, and in order to see the errors we have to exaggerate them. If we improve the measurements by a factor of 1000 over the years, the picture stays the same! If the only way to verify each new system is to perform a scientific listening test, nothing will ever get done.
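To put some numbers on that sense of scale, here is a small sketch using the standard decibel conversion and the textbook signal-to-noise figure for an ideal quantiser driven by a full-scale sine wave. The function names are just illustrative, and real converters will fall somewhat short of the ideal figures.

```python
import math


def amplitude_ratio_to_db(ratio: float) -> float:
    """Convert an amplitude (voltage) ratio to decibels."""
    return 20 * math.log10(ratio)


def ideal_quantiser_snr_db(bits: int) -> float:
    """SNR of an ideal quantiser with a full-scale sine input.

    Equivalent to the familiar 6.02 * bits + 1.76 dB rule of thumb.
    """
    return 20 * math.log10(2 ** bits * math.sqrt(1.5))


print(round(amplitude_ratio_to_db(1000), 1))  # 60.0 (a factor of 1000 is 60 dB)
print(round(ideal_quantiser_snr_db(16), 1))   # 98.1
print(round(ideal_quantiser_snr_db(24), 1))   # 146.3
```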

      “There is no doubt in my mind that equipment manufacturers etc. are making such tests but they probably never reveal the results”

      I am not convinced of that at all! It would be so difficult and expensive, and almost always fruitless, that I think such tests hardly ever occur. At the end of the day, despite all the listening tests in audiophile history, the aim of an audio system remains basic ‘linearity’. Using the right chips, and checking the results with an oscilloscope, multimeter and, in this day and age, a PC-based measurement system, the manufacturer can get to the heart of the matter directly. It shows that their system is more-or-less the same as everyone else’s. (There are exceptions: ‘maverick’ designers believe their Frankenstein systems’ inferior measurements are a sign of genius and musicality). It is then time for the creative blurb, fancy box and logo to do their psychological work on the customers!

