Linux-based active crossover: getting there

A few weeks ago I wrote about my desire to dump Windows and to go with Linux for audio. The aim is to create an active crossover system that is the best of all worlds:

  • completely flexible, programmable down to bit level (I am going to program it – or pretty much port my existing code from Windows)
  • powerful enough to implement any type of filtering (large FIRs in particular)
  • not dependent on specific hardware – can use a variety of low-cost PCs, including old PCs from the back of the cupboard, fanless compact low-powered machines, or dedicated DSP cards.
  • all libraries, drivers, compilers are open source; not beholden to commercial companies
  • capable of streaming from a variety of sources without sample conversion
  • not bogged down with continuous updates and anti-virus shenanigans

The goal is to use DSP to replace the passive crossovers that so degrade conventional speakers’ performance, not merely to use the PC as a ‘media hub’. The Linux-based audio system can do this, and despite its workaday image represents the ultimate hi-fi source component. Hi-fi sustains an industry, and hordes of enthusiasts are prepared to spend real money on it. What an interesting thought, therefore, to realise that as a source there will never be any need for a better component than the ‘Linux box’. Here exists a general-purpose number cruncher that is powerful enough for all audio applications, bristling with connectivity, easy to equip with digital-to-analogue converters whose raw fidelity has long surpassed the limits of human hearing, and yet (if you use an old, surplus PC) costs less than a Christmas cracker toy to own – unlike an equivalent Windows PC.

Details, details

Regardless, reading around the web on the subject, I seem either to have unique requirements for my active crossover system that no one has ever thought of, or requirements so trivial that no one has considered them worth writing down. I am still not sure which it is…

On the face of it Linux seems to have audio covered and then some, but in amongst the fantastically comprehensive JACK solution I don’t really feel I know what is going on. It feels like overkill. Is the audio being resampled? I think I need a simpler solution.

Just to summarise the thinking behind my requirements:

  • I want to design my own DSP system rather than trying to adapt existing systems.
  • I want to be able to understand exactly what is going on.
  • Dedicated digital signal processing systems are relatively expensive and often not very powerful, and getting the most out of them can entail a considerable learning curve whose effort is not applicable elsewhere, whereas PCs running Linux are ridiculously powerful and cheap.
  • Linux can be installed on any PC for free, and there is no danger of The Powers That Be decreeing that it must be ‘upgraded’, with the high chance that the system will be broken by the upgrade. For example, the mandatory ‘upgrade’ from XP to Windows 7 broke my current system, entailing the fitting of a second sound card due to a change in functionality of a sound card driver. And it cost money.
  • I want the best of all worlds: to be able to program the system at low level as though it is a microcontroller sending samples to a DAC, but for it also to have nice GUIs, play CDs, run Spotify without the need for any other piece of hardware linked with a cable.
  • It would be nice if the system would run on any old PC e.g. fanless.
  • It would be nice to be able to use any sound card as the multichannel DAC.
  • I don’t want the system to resample the audio. This is the ‘killer’ requirement that, I think, most people never give a second thought to.

That last requirement is what the whole thing is about. It has nothing to do with conversion between 48 kHz and 44.1 kHz, or 96 kHz and 192 kHz, but with the resampling that would be necessary in going from, say, 44.0999 kHz to 44.1001 kHz: if the source and DAC run at nominally the same sample rate, but use separate crystal clocks, they will drift apart over time. This can be handled using adaptive resampling of the audio stream in software. Resampling would involve extra DSP, so even if I were happy that no audible degradation was occurring, it would be sapping more CPU power than necessary, or relying on a particular type of sound card that does its own resampling.

The alternative is to ensure that the source and DAC are synchronised in terms of their average sample rates. The DAC will have a fixed, rigid sample rate, so the only rate that can vary is the source and, if the source is a stream of bytes from an audio application (e.g. a media player program), this synchronisation can be arranged by requesting chunks of data from the source only when the DAC is ready to receive it. A First-In-First-Out (FIFO) buffer is loaded with these chunks of data, and the data is streamed out to the DAC continuously.
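
This pull model can be sketched in plain C (no ALSA here; the ring-buffer layout, names and sizes are my own, for illustration). The consumer drains fixed-size chunks, and the source is asked for data only when the drain has made room, so its average rate is slaved to the DAC’s clock:

```c
#include <string.h>

#define FIFO_FRAMES 4096u   /* power of two keeps the index maths safe */

typedef struct {
    short buf[FIFO_FRAMES];
    unsigned head, tail;    /* head: next write, tail: next read */
} fifo_t;

static unsigned fifo_fill(const fifo_t *f)  { return f->head - f->tail; }
static unsigned fifo_space(const fifo_t *f) { return FIFO_FRAMES - fifo_fill(f); }

/* Source side: called only when the consumer has made room. */
static unsigned fifo_push(fifo_t *f, const short *in, unsigned n)
{
    unsigned i, can = fifo_space(f);
    if (n > can) n = can;
    for (i = 0; i < n; i++)
        f->buf[(f->head + i) % FIFO_FRAMES] = in[i];
    f->head += n;
    return n;
}

/* DAC side: drains one period's worth at a time. */
static unsigned fifo_pop(fifo_t *f, short *out, unsigned n)
{
    unsigned i, can = fifo_fill(f);
    if (n > can) n = can;
    for (i = 0; i < n; i++)
        out[i] = f->buf[(f->tail + i) % FIFO_FRAMES];
    f->tail += n;
    return n;
}
```

In the real system the push would live in the capture thread reading the source and the pop in the playback thread feeding the sound card, with the blocking ALSA calls providing the pacing.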

I would like to think I have now found the solution using Linux. I would be very grateful if any Linux gurus out there would care to correct me if I am wrong on any of this:

  • Linux has several (confusing) layers when it comes to handling audio. However, most audio applications will work directly with ALSA, which allows fairly low level programming.
  • Typical Linux distributions also come with Pulseaudio loaded and running. Pulseaudio is a higher level system than ALSA and has many nice features, but automatically performs resampling(?). Pulseaudio can be removed.
  • Another step up in sophistication is JACK, a very comprehensive system that requires a server program to be running all the time in the background. There is no obligation to set JACK running.
  • As with Windows, fitting a sound card into a Linux machine causes the driver for that sound card to be loaded automatically. ALSA can then ‘see’ the card and it can be referred to as “hw:3,1” where the ‘3’ is the card, and the ‘1’ is a device on the card, or using aliases e.g. “hw:DS,1” etc. – this is useful because the numeric designation may change between boot-ups.
  • “hw” devices are accessed directly without any resampling, as opposed to “plughw” devices. Both options are usually available for most sound cards and their drivers. I am only considering the “hw” option.
  • Driver capabilities can be ascertained in detail by dumping the driver controls to a file using various methods e.g. “alsactl store” etc.
  • Linux provides drivers that have been put together by enthusiasts based on sound card chipsets, so not all the facilities listed by the driver will necessarily be available for every card.
  • ALSA’s API allows real time streaming to and from ALSA devices, including multichannel frames. Taking data from a device is known as capture, and sending to a device is known as playback (or similar).
  • A device can be designated as the ALSA default, which most audio applications default to sending their output to. Applications like Spotify can only direct their output to the default device.
  • There is a ‘dummy’ driver available called snd-aloop. This can be loaded into the system at boot-up. To ALSA it appears as a sound card called Loopback with eight capture devices and eight playback devices.
  • snd-aloop can be designated as the default device.
  • snd-aloop has a very desirable feature: its sample rate can be varied via a real time control. This control is accessible like the controls that are available on any sound card driver and can simply be set from a terminal using a command such as “amixer cset numid=49 100010” where 49 is the index of the control and 100010 is the value we are setting it to. The control can also be adjusted from inside your own program.
  • Clearly, if a way can be found to compare the sample rates of the DAC and snd-aloop, then snd-aloop‘s sample rate can be adjusted occasionally to keep the source’s average sample rate the same as the DAC’s. N.B. this is not dynamically changing the pitch or timing of the stream – this is fixed and immoveable and set by the DAC – but merely ensures that the FIFO buffer’s capacity is not exceeded in either direction. If the source was not asynchronous (e.g. not a CD or on-demand streaming application whose data can be requested at any time) but a fixed rate stream with no way of locking the DAC to its sample rate via hardware, then this would not be possible, and adaptive re-sampling would be essential.

After a few days of wrestling with this, my experience is as follows:

  • Removing Pulseaudio from Ubuntu (“sudo apt-get remove pulseaudio --force” or similar) has side-effects, and the system loses many of its GUI-adjustable settings options because various Gnome-related dependencies are removed too. It doesn’t ‘break’ the system; merely makes it less useable. The solution can be as crude as re-installing Pulseaudio in order to make a settings change and then removing it again! I don’t know that it is essential to remove Pulseaudio, but it certainly feels better to do so.
  • Various audio apps are happy to play their outputs into snd-aloop, and my software can capture its output and process it quite happily.
  • The real core essentials of using the ALSA API for streaming are straightforward-ish, but documentation beyond a simple description of each function is sparse. In many cases, the ALSA source code is viewed as being sufficient documentation. As an example, try to find any information on how to modify an ALSA driver control without actually delving into an existing program like amixer to try and work it out. I find that most ‘third party’ tutorials seem to obscure the essentials with multiple equivalent options demonstrating all the different ways that a single task can be performed.
  • My ASUS Xonar sound card may yet turn out to be useful now that I don’t have to worry about using it as an input as well as an output: it is a high quality eight channel DAC that seems well-behaved in terms of lack of ‘thump’ at power-on and -off.
  • I found the easiest way to adjust the snd-aloop sample rate dynamically was by cutting and pasting the source code for the standard ALSA/Linux program amixer into my program (isn’t open source software great?) and passing the commands to it with the same syntax as I would use at the command line.
  • The system seems stable and robust when the PC is doing other things, e.g. opening up highly graphical web pages in a browser. No audible glitches at all and no jump in the difference between my record and playback sample counters.
  • I am, as yet, unsure as to the best way to implement the control loop that will keep snd-aloop and the Asus Xonar in sync. With a snd-aloop rate setting of 100000, i.e. nominally neutral, there is a drift of about one sample every couple of seconds (an evening’s worth of listening could be assured without any adjustment at all by having a large enough FIFO and slightly-longer-than-desirable latency…). I am currently keeping a count of the number of samples captured vs. the number of samples sent to the DAC and simply swapping between fixed ‘slightly slow’ (99995) and ‘slightly fast’ (100005) snd-aloop sample rates, triggered when the (heavily-averaged) difference hits either of two thresholds.
  • In terms of the ALSA sample streaming I just use the ‘blocking method’ inside two separate threads: one for capture and one for playback.
  • It occurs to me that this system could be used to stream to an HDMI output, thence to an AV receiver with multiple output channels. I am not sure whether the PC locks to the AV receiver’s DAC sample rate via HDMI (is it bidirectional?), whether the AV receiver resamples the data, or whether it syncs itself to the incoming HDMI stream.
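
The bang-bang control scheme described above can be sketched in a few lines of C. The threshold and smoothing constants here are illustrative, not settled values:

```c
/* Bang-bang controller: the captured-minus-played sample count is
 * heavily averaged, and when it crosses a threshold the snd-aloop
 * rate-shift setting flips between "slightly slow" and "slightly
 * fast". Constants are illustrative. */

#define RATE_NEUTRAL 100000
#define RATE_SLOW     99995   /* source gaining on the DAC: slow it down */
#define RATE_FAST    100005   /* source falling behind: speed it up */
#define THRESH        64.0    /* frames, illustrative */

typedef struct {
    double avg_diff;   /* exponentially averaged (captured - played) */
    int    rate;       /* current rate-shift setting */
} sync_ctl_t;

static int sync_update(sync_ctl_t *s, long captured, long played)
{
    double diff = (double)(captured - played);
    s->avg_diff += 0.01 * (diff - s->avg_diff);   /* heavy smoothing */
    if (s->avg_diff >  THRESH) s->rate = RATE_SLOW;
    if (s->avg_diff < -THRESH) s->rate = RATE_FAST;
    return s->rate;   /* feed this value to the snd-aloop control */
}
```

The output only changes state at a threshold crossing, so the snd-aloop control needs touching only occasionally rather than continuously.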

You may find it hard to get excited by this stuff, but not me: it’s a case of feeling that I own the system, whereas my recent experiences with Windows showed that there the system is merely held ‘under licence’ from Microsoft and the hardware vendors.

10 thoughts on “Linux-based active crossover: getting there”

  1. It sounds like a lot of the complexity in your design is coming from the desire to use the source sample rate dynamically throughout the processing chain. Why not just let the processing run at the maximum sample rate of the DAC and upsample all your source material to that right at the start of the playback chain? The math of sample-rate conversion is solid and well-understood. This way, you can just hook your filtering and crossover software to the DAC through ALSA and then pump in whatever material you want without needing to mess around with the sample rates internally.


    1. Many thanks for the comment agd. Yes, I can see how resampling can make connecting digital audio systems together as simple as analogue, and I think JACK relies on this for its virtual patchbay concept. But in this particular instance I think I would be wasting CPU power and a minuscule amount of quality. I think snd-aloop comes close to being ideal for my application, but the best solution of all would be for me simply to be able to pass requests to the dummy driver for chunks of data in response to the output card’s requests for data, rather than adjusting the sample rate, which is a rather indirect way of achieving the same thing. For all I know it would only take a few lines of code to create a driver to do this.


  2. Two links that might be of interest to you –

    http://rtaylor.sites.tru.ca/2013/06/25/digital-crossovereq-with-open-source-software-howto/

    https://github.com/bmc0/dsp/wiki/System-Wide-DSP-Guide

    I also am in pursuit of a linux based dsp crossover to work with the Kodi media center. The problem is that Kodi will only send audio output to an alsa device and not to a program for post processing. Thus I need to understand how to program in alsa (.asoundrc). The second link looks promising for my purposes although the documentation could be a lot better. I think the program must rewrite .asoundrc based on a data file that the user constructs.


    1. Many thanks Alan. I have seen the first link before, but I note that the author is not too concerned about using resampling to synchronise the source and the DAC. This was something that I specifically wanted to avoid if I could. My current solution (using the modified snd-aloop dummy driver and controlling its sample rate from my software) does actually work perfectly for this application. I believe that Kodi is happy to send its output to snd-aloop, too. The main drawback is that my version adds latency, which varies cyclically over several minutes, although maybe I could tighten that up a bit if I tried. It is also decidedly ‘off grid’ as a solution, and I don’t think it could be recommended for anyone else!

      Allowing ALSA to resample (i.e. using the ‘plughw’ variant of any ALSA device) is by far the easiest and most portable way of doing it.

      Have you looked at snd-aloop and/or alsaloop to get your system going? Kodi can send its output to the dummy loopback driver, and your program can take its data from it. Linux will resample transparently, and you can select the quality (and consequent CPU load) you need.


  3. If you use Librespot you can specify that it write to a pipe. Here’s the command I use:
    ./librespot -v -n hello -b320 -c audio/librespot --enable-volume-normalisation --backend pipe --device audio/librespot/audiofifo

    The pipe will block when you’re not reading from it, which is the mechanism that keeps Librespot in sync with your soundcard’s sample rate. If you continually read from the pipe you’ll actually see a 3 minute song fly by in ~20 seconds, which proves the source rate is well under your control. I don’t recommend doing this continually, as Spotify will probably shut you down for suspected song ripping or something. Then you simply set up the reads to keep the sound card buffer happy. Something like when (avail >= period_size) {read a period’s worth of data from the pipe} has worked for me, though being a little smarter about it allows you to reduce buffer size some and thus keeps latency down. I haven’t worked with the DSP stuff yet so I’m not quite sure how to slot that in the middle yet.

    I was actually trying to use snd-aloop so as not to have to modify source programs like Librespot. I currently have it and shairport-sync modified to disable their internal soft vol and also pass me the commanded volume as an integer via stderr, which I can then use however I want, like passing it along to my receiver via its Ethernet API. I’m now running custom code though, which makes getting updates from the author less easy. An alternative is that if you specify an alsa mixer control to librespot or shairport-sync, that also disables the internal softvol and instead writes the commanded volume to the alsa mixer. So I was planning to use snd-dummy to create a dummy mixer to capture the commanded volume and snd-aloop to capture the actual audio when I discovered the clock mismatch issue which led me to your site.

    So after reading your thoughts I’m currently weighing the convenience of easy updates from unmodified librespot and shairport-sync vs the reduced complexity of eliminating snd-aloop. If I can get snd-aloop working without glitches I’ll probably stick with that.

    Currently I’m working on multiroom audio where each location has its own librespot and airplay sync, then a server which has librespot and airplay syncs for combinations of different rooms. When I pull up spotify or airplay on my phone I’m currently presented with a list like [kitchen, bedroom, livingroom, familyroom, sunroom, all]. The combinations like [upstairs, inside, kitchen and livingroom] are what I’m working on now. All locations with a TV also capture TV audio via toslink and a hifiberry dac+dsp. This is all done with nodejs for the fancy stuff and c for the heavy lifting and obviously the ALSA interface. Nodejs is a great solution for fancy web enabled graphics, which is why I mention it to you.

    The multi room synchronization is basically done with htstamp combined with delay to get accurate time targets, then either dropping or duplicating samples to keep in sync with the source. I can’t hear dropping or adding a single sample, so the reduced CPU requirement plus the benefit of multiroom synchronized audio is an OK tradeoff for me, especially since dropping or duplicating a sample a minute seems to be all that’s required to keep all my soundcards in sync. Except for the Denon receiver connected over HDMI they’re all from hifiberry and all have their own oscillator clocks, so maybe that helps.
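
    The drop-or-duplicate adjustment described in this comment can be sketched like this (mono samples, and the function name and convention are mine: correction = -1 drops one frame, +1 repeats one):

```c
#include <string.h>

/* One-off correction while copying a period of mono samples:
 * correction = -1 drops the first frame, +1 repeats it, 0 copies
 * verbatim. Returns the number of frames written to 'out', whose
 * capacity must be at least n_in + 1 ('out' must not overlap 'in'). */
static unsigned adjust_copy(const short *in, unsigned n_in,
                            short *out, int correction)
{
    if (correction <= -1 && n_in > 0) {          /* drop one frame */
        memcpy(out, in + 1, (n_in - 1) * sizeof *in);
        return n_in - 1;
    }
    if (correction >= 1 && n_in > 0) {           /* duplicate one frame */
        out[0] = in[0];
        memcpy(out + 1, in, n_in * sizeof *in);
        return n_in + 1;
    }
    memcpy(out, in, n_in * sizeof *in);
    return n_in;
}
```

    At one correction a minute the artefact is a single repeated or missing sample, which, as the comment says, is below audibility for most material.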

    All of this runs over wifi, which is actually working quite well. It’s good enough that I can’t hear the adjustments or phase shifts in a continuous sine wave until up over 7 kHz, which is probably fine. I can’t hear it at all in real music. This software is what’s allowing me to currently listen through snd-aloop without accumulating latency or risking XRUNs, as it’s actually duplicating a single sample about every two seconds to keep things in sync. There’s only one clock involved here not counting the semi-artificial snd-aloop clock, so the sample adjustments aren’t strictly necessary. Thanks very much for documenting that the snd-aloop rate can be manipulated! A great point in the right direction.

    Once this is done I plan to convert an old set of JBL L100s to active independent wifi speakers. This will involve three Raspberry Pis (source, left, and right) on a dedicated wifi network, two hifiberry dac+dsp boards (left + right), and 3 channels of class D power per speaker. For this I’ll be diving into LADSPA and other stuff like it, I’m sure. If you’re interested in trading knowledge I’m all for it.

    –Michael Szilagyi


  4. Hi, if you want you can just disable pulseaudio systemwide instead of removing it. Moreover, you can disable it for just a particular user if needed. This is how I am using it on 18.04 with great success. Let me know if you have questions.

