That Floyd Toole lecture

I watched this because so many people are talking about it – and I even took notes.


The overall thrust of the lecture is, I would say, “objectivity works”. We can make measurements of the hardware, score them against some fixed criteria, and then demonstrate that the scores correlate with real human preferences in blind tests. Hurrah! The age-old debate is over, and we can improve our new speaker designs by building them to maximise their objective scores, in the full knowledge that this will correlate with human preferences.

However, I am not totally convinced by the criteria that were specified in the lecture: it seemed to me that there might be gaps in the argument and some circularity.

What he showed:

  • Sighted listening tests can be flawed (thereby implying: all sighted listening tests are unreliable).
  • A set of measurements (dubbed ‘Spin-o-rama’) can be carried out on speakers in an anechoic chamber and munged together to create a performance index related to flatness of frequency response and smoothness of off-axis response. Transient response is not a factor. At all.
  • In listening tests, speakers with the ‘best’ Spin-o-rama score are usually preferred by listeners over those with the worst scores (implied: all else being equal).
  • Mono allows maximum discernment of difference, and does not contradict stereo listening results in the above tests (implied: therefore mono should be used for all listening tests).
  • Trained professional listeners give the ‘statistically healthiest’ range of scores, and do not contradict ordinary listeners in the above tests (implied: therefore trained professionals should be used for all listening tests).

What he didn’t show:

  • That it is valid to use the Spin-o-rama score in reverse i.e. as a tool for designing a speaker. He implies it is, but does not prove that a poor speaker could not be designed that achieves an exemplary Spin-o-rama score.
  • That transient response doesn’t matter – it is simply ignored. The speakers tested may have had good transient responses, or not, but as most of them were of conventional design they may all have been much of a muchness.
  • That various speaker technologies are inherently better or worse than others i.e. no view on whether sealed cabinets are better than bass reflex, or active crossovers better than passive – and his performance index is indifferent to this, assuming that flat steady-state frequency response is all that matters.
  • That mono speakers and trained professionals are the best choice for all listening tests.

It is possible to produce different colourations related to phase shifts while still producing a perfect frequency magnitude response (the drivers may have their phases matched perfectly throughout the crossover but the phase is shifted relative to other components in the signal). Similarly, bass reflex configurations distort the time domain response while maintaining a perfect steady state sinusoidal magnitude response. Dr. Toole’s tests don’t address these factors.

I have no doubt that flat frequency response and smooth off-axis response are essential, as he says, but might there be more to it than just that? Any unexplained deviations between the listening tests and the measurements (it isn’t a perfect correlation) could be explained by a multitude of factors including the speaker’s transient response which, after all, is a straightforward difference between what was recorded and what the speaker emits – it is just that someone around 1936 declared that ‘phase doesn’t matter’. Until recently it has not been possible to verify this, because it was not possible to produce a high quality output with close to perfect phase. Comparing different speakers all of which have phase/time distortion and other problems, and finding that listeners cannot tell them apart (in mono using someone’s idea of ‘typical’ music), does not tell us that a speaker without those distortions would not sound better.

Correlation is not causation, but people are talking about the Harman method as if it were. So, if I were a speaker designer doing things by the Toole book, I would always use bass reflex without thinking, as this would have no effect on the Spin-o-rama score but would result in a smaller box. And I would be supremely relaxed about crossover design, ensuring only that it matched the phases of drivers through the crossover. Phase correction and sealed enclosures wouldn’t get a look-in because they offer nothing extra in terms of the Spin-o-rama score but cost more to manufacture.

My opinion is of no consequence, of course, but there are some serious people who do suggest that transient response matters, and it would have been nice if the guru of gurus could have mentioned it, if only to dismiss it with reasons.


6 thoughts on “That Floyd Toole lecture”

  1. “it is just that someone around 1936 declared that ‘phase doesn’t matter’.”

    Hi, there is a bit more evidence behind the assumption than the above. 😉 You are, I assume, referring to “frequency-independent phase error”, the so-called all-pass phase error, because, as we all know, phase errors that are associated with an amplitude change (for example, a crossover on a loudspeaker driver) will have an effect on the frequency response, and are readily audible because the frequency response has changed.

    So, looking at all-pass phase errors only, are they audible? In extremis, the answer is definitely yes. If you introduce a 360,000 degree phase error at 1 kHz, the musical playback will be delayed by one second at 1 kHz, so that is obviously audible! The question needs to be restricted to practical realities, say no more than the few hundreds of degrees that are common in some hifi components.
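The arithmetic behind that one-second figure can be checked with a short sketch. It assumes the plain phase-delay relation for a single sine component (delay = phase / 360 / frequency); the function name `phase_delay_seconds` is purely illustrative and does not come from any library.

```python
def phase_delay_seconds(phase_deg, freq_hz):
    """Time shift of a sine at freq_hz that lags by phase_deg degrees.

    Assumes simple phase delay: one full cycle (360 degrees)
    lasts 1 / freq_hz seconds.
    """
    cycles = phase_deg / 360.0
    return cycles / freq_hz


if __name__ == "__main__":
    # The extreme case from the comment: 360,000 degrees at 1 kHz.
    print(phase_delay_seconds(360_000, 1_000))
    # A more practical case: one full turn (360 degrees) at 1 kHz.
    print(phase_delay_seconds(360, 1_000))
```

On these assumptions the extreme case works out to one second, while a 360 degree error at 1 kHz corresponds to one millisecond.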

    The reason that transient response is not important, independently of its effect on the frequency response, is that listening tests using all-pass filters show that all-pass phase distortion is largely undetectable.

    ‘Largely’ undetectable? That’s right, listening tests do show the ability to detect all-pass phase changes, under certain conditions, as follows:
    – audible using special test signals and headphones.
    – inaudible (in tests conducted to date) using music and headphones.
    – even less audible, by quite a lot, using loudspeakers, because room reflections blur things.

    I hope this is of some interest. cheers

    P.S. a couple of useful links:
    – the David Clark 1981 article, posted on the ethanwiner site.
    – a literature summary by ‘mark’, posted on audioholics, titled “Human Hearing – Phase Distortion Audibility Part 2”.


  2. Thanks Granite Slack. Yes, I have read of some of these listening tests (and thanks for the links – I’ll have a good look at them later) but I remain sceptical about what they show. As the chap from Bodzio Software said in one paper, he had to un-learn his previous listening ‘skills’ once he encountered a DSP-corrected system. Until then he had been listening only for flat frequency response (and low distortion, the ability to play loud etc.) so was effectively deaf to the finer timing/transient aspects. His commercial interests aside, that is possible, isn’t it? Floyd Toole’s trained listeners (and everyone else) will, presumably, have learned their skills mainly by listening to standard phase-mangling speakers.

    At one point a while ago I posted a link to a Bruel and Kjaer paper that talked about phase and timing in speakers as though it was ‘obvious’ that errors would colour transients – it did seem logical. I also linked to a speaker review that went into great depth on the subject of various passive crossovers and their phase shifts and how they colour transients, even though their frequency response plots would be flat. The author did seem quite convincing, as though these effects were clearly audible to him.

    I am merely starting from the opposite side of the argument: that the aim of the system is to reproduce the input waveform unchanged in any way, and if we can do that, listening tests and their potential pitfalls are not required. Whether that is possible is another matter..!

  3. Excellent blog. Another thing to avoid is being dogmatic. Categorical statements are the ones to test most rigorously. Yes of course phase, time delay (NOT the same!) and such “matter”, but within what constraints? If the tweeter lags the woofer by 1000 msec, yes that is a problem! Audible. If both are delayed by 1000 msec, that should not affect the sound at all, but maybe the sync with the video if applicable. Some claims can be proven false from first principles. E.g.: right now I’m listening to a pair of Bose 901 (series I or II). The sound is “stereo” and “sounds good” (to me). It has been frequency EQ’ed to my tastes. But is it “accurate”? Of course not. At minimum, there are two cabinets. Each has nine drivers, one with a “direct” path to my ears, and eight with “reflecting” paths. I’ve never heard pinpoint imaging from 901’s nor would I expect to. But if I listened to Danley’s Unity or Synergy I would expect to.


  4. Cheers for the comment. I still feel pretty dogmatic about phase and timing, though! I’m still not convinced that phase distortion is OK as long as it is identical for both speakers – I would avoid it for mono, too. At the very least it is colouring transients (the leading edges of notes, for example) but it is also breaking the ‘equation’ that links the acoustic source (voice or instrument) with its surroundings. Unless we take a very simple view of how human hearing works, it is something I want to avoid.


  5. The impulse response and the frequency response are related in such a way that if you have perfect impulse response (transient), you also have a perfectly flat frequency response. That is just a mathematical fact of linear systems. Of course none of this is completely linear, but that’s not what we are talking about. Any problems in transient response will show themselves as peaks and valleys in the frequency response. Think about it: if the impulse response is ringing, it will do so at one or more frequencies, which will show themselves in the frequency response. If it has a slow rise time, it simply has poor high frequency response. Indeed, the frequency response is simply the Fourier transform of the impulse response. A perfect impulse will show itself as a perfectly flat frequency response. So my point is this: Mr. Toole does not neglect the transient response, but he is so used to the equivalence between frequency domain and time domain that he glosses over it so quickly that a non-expert might think he neglects it. I suspect some resistance to this point, but please try and see if you can produce a toy linear system that has a messy transient response and a flat frequency response. I dare you.
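The Fourier relationship described in this comment can be illustrated with a small sketch. It is only a toy: the 16-sample window, the four-sample "smeared" pulse standing in for a slow rise time, and the helper name `dft_magnitudes` are all assumptions chosen for illustration.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of the discrete Fourier transform for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


# A perfect one-sample impulse.
perfect = [1.0] + [0.0] * 15
# The same area smeared over four samples: a crude "slow rise time".
smeared = [0.25] * 4 + [0.0] * 12

perfect_mags = dft_magnitudes(perfect)
smeared_mags = dft_magnitudes(smeared)

if __name__ == "__main__":
    print([round(m, 3) for m in perfect_mags])
    print([round(m, 3) for m in smeared_mags])
```

The single-sample impulse has the same magnitude in every bin, while the smeared pulse keeps its low-frequency level and loses output towards the top of the band. Note that this only looks at magnitude; it says nothing about phase.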

    For good measure, let’s talk about non-linear effects. One of these is what is called harmonic distortion. Basically, it generates frequencies that are not present in the input signal. For example, an infinitely long 1 kHz tone at the input of a linear system will always give only a 1 kHz output, at some phase and amplification. Harmonic distortion will create additional harmonic frequencies. For example, tube amps often create a lot of even harmonics, which are said to sound “warm”. Basically, the more linear the speaker, the more transparent it is.
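As a rough illustration of that point, the sketch below passes a pure tone through a gain-only (linear) stage and through a hypothetical second-order nonlinearity, y = x + 0.1x², then measures the level at twice the tone frequency. The 0.1 coefficient, the window length and the helper name `bin_amplitude` are assumptions made for the example and do not model any real amplifier or speaker.

```python
import cmath
import math

N = 64         # samples in the analysis window
TONE_BIN = 4   # the test tone completes exactly 4 cycles in the window


def bin_amplitude(samples, k):
    """Amplitude of the sinusoid at DFT bin k (valid for 0 < k < N/2)."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
              for i, x in enumerate(samples))
    return 2.0 * abs(acc) / n


tone = [math.sin(2 * math.pi * TONE_BIN * i / N) for i in range(N)]

linear = [0.5 * x for x in tone]              # gain only: no new frequencies
distorted = [x + 0.1 * x * x for x in tone]   # hypothetical even-order nonlinearity

if __name__ == "__main__":
    second = 2 * TONE_BIN
    print("linear, 2nd harmonic:   ", round(bin_amplitude(linear, second), 6))
    print("distorted, 2nd harmonic:", round(bin_amplitude(distorted, second), 6))
```

The linear stage leaves the second-harmonic bin empty, whereas the squared term puts energy there, since sin² contains a component at twice the original frequency.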


    1. You are correct that a perfect impulse response will give a flat frequency response. However, a flat frequency response doesn’t guarantee a perfect impulse response.

      Don’t forget that the overall response of a speaker is cobbled together out of several drivers, and usually a port whose output is delayed relative to the bass driver. Put them all together with some phase-shifting filters and you can achieve a flat frequency response but an imperfect impulse response.
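One way to see this is a toy two-way crossover. The sketch below assumes an idealised second-order Linkwitz-Riley crossover, normalised so the crossover frequency is 1 rad/s, with the tweeter polarity inverted as is usual for that alignment. Real drivers, the port and the cabinet are ignored, and the function names are illustrative only.

```python
import cmath
import math


def lowpass(s):
    """Assumed woofer feed: second-order Linkwitz-Riley low-pass."""
    return 1.0 / (1.0 + s) ** 2


def highpass(s):
    """Assumed tweeter feed: second-order Linkwitz-Riley high-pass."""
    return s ** 2 / (1.0 + s) ** 2


def summed_response(omega):
    """Acoustic sum of both branches, with the tweeter polarity inverted."""
    s = 1j * omega
    return lowpass(s) - highpass(s)


if __name__ == "__main__":
    for omega in (0.1, 1.0, 10.0):
        h = summed_response(omega)
        print(omega, round(abs(h), 6), round(math.degrees(cmath.phase(h)), 1))
```

Under these assumptions the summed magnitude is 1 at every frequency, but the phase swings from near 0 degrees well below the crossover, through -90 degrees at the crossover, towards -180 degrees well above it. Algebraically the sum reduces to (1 - s)/(1 + s), which is a first-order all-pass.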

      At random, here is a speaker reviewed by Stereophile:

      “The Blade Two offers a superbly flat, even response on the tweeter axis. Wow!”.

      But note that even though the measured frequency response is very flat, the step response isn’t a perfect step.

      I guess it is a variation of the idea of the all pass filter.

      “You are familiar (or at least should be) with low-pass and high-pass filters. As their names imply, they pass one part of the audio spectrum while attenuating the others. They are the basis for loudspeaker crossovers. As with any analog filter, there is phase shift associated with the change in output magnitude of these filters…
      If you were to add the output of these filters back together, essentially combining them with a unity gain mixer, you would get the response in Fig. 2. Here you see that the magnitude response is perfectly flat; it passes all portions of the audio spectrum, thus the name all-pass. However, the phase response is not flat. There is phase shift associated with this all-pass behavior. This phase shift is due to the contributions of the low-pass and high-pass filters that were combined (mixed).”

      “From an intuitive standpoint it would seem that a flat magnitude response in the frequency domain should correspond to an impulse response (time-domain) containing only a single delta function (recall that the impulse function and a constant-valued function constitute a Fourier transform pair). However, the all-pass filter’s impulse response is actually a large negative impulse followed by a series of positive decaying impulses.”
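The behaviour described in that last quotation can be reproduced with a toy first-order digital all-pass, H(z) = (-a + z⁻¹) / (1 - a z⁻¹). This is a textbook form assumed for illustration, not necessarily the filter the quoted article used, and the coefficient a = 0.8 and the function names are arbitrary choices.

```python
import cmath
import math


def allpass_impulse_response(a, n_samples):
    """Impulse response of y[n] = -a*x[n] + x[n-1] + a*y[n-1]."""
    h = []
    x_prev = 0.0
    y_prev = 0.0
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0   # unit impulse input
        y = -a * x + x_prev + a * y_prev
        h.append(y)
        x_prev, y_prev = x, y
    return h


def allpass_magnitude(a, omega):
    """|H| evaluated on the unit circle at omega radians per sample."""
    z_inv = cmath.exp(-1j * omega)
    return abs((-a + z_inv) / (1.0 - a * z_inv))


if __name__ == "__main__":
    a = 0.8
    print([round(v, 4) for v in allpass_impulse_response(a, 6)])
    print([round(allpass_magnitude(a, w), 6)
           for w in (0.1, 1.0, 2.0, math.pi)])
```

With 0 < a < 1 the impulse response starts with a negative sample of height -a, followed by a positive, geometrically decaying tail, while the magnitude stays at 1 for every frequency. So it is a linear system with a flat magnitude response whose impulse response is not a single clean spike.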

      (I don’t claim to be an expert on this stuff – I have never designed a passive crossover in my life! With my speakers I have just bypassed the whole tricky problem…)

