UPDATED 16/03/15 Approximately every two years I find myself inspired to have a go with Linux. I install Ubuntu on an old PC and congratulate myself on having finally made the right choice. Everything works fine: all the devices are auto-detected correctly, and although the graphics and text are a bit lumpy, it looks as though it can do everything Windows can do. It never lasts. Within a short time I try to do something beyond the basic web surfing and word processing and it doesn’t quite work. So I go to the web, and of course there’s usually a solution buried in a forum somewhere, and it invariably involves editing a config file. But along the way I may have found several other ‘solutions’ that didn’t work, and for each I maybe edited a different file or changed something using some little app I’ve installed. At the end, even though the system may be working, I am never quite sure how I got there, nor confident I could reproduce the same working system on another PC.
Well, the time has come again, and I am typing this using the latest version of Ubuntu. Everything is wonderful so far, and even Spotify is running flawlessly. Specifically, though, I want to get my active crossover system working on Linux, not Windows. My experience with Windows 7 running on slightly older PCs is not good. I have a laptop approximately 5 years old which will grind almost to a halt for several minutes every day, performing some sort of scan of itself, and I don’t know enough to do anything about it. The desktop PC that I use for the active crossover is slightly better, but it, too, takes quite a while to ‘warm up’ and is also prone to the occasional glitch while playing music, due to deciding to update its anti-virus database – I am sure it was not a problem with Windows XP. In contrast, running Ubuntu on an older desktop PC without much RAM, the experience is one of ‘solidity’. I am not experiencing the operating system going AWOL for several seconds at a time. But it comes at a price. I really, really don’t want to have to understand the details of any operating system, and Windows is good for the person who maybe wants to dip into a bit of programming (a distinctly different activity from IT) without having to worry too much about the really low level details. Windows feels as though it is ‘self-healing’. Every time the PC is turned on it starts scanning itself, checking for inconsistencies, downloading updates. New hardware is detected automatically and the user never edits configuration files. Ubuntu feels a little different. By all means correct me if I am wrong, but the impression I get is of a system that is dependent on lots of configuration files that are not hidden from the user. 
Of course these files get changed by the operating system itself (just as Windows must change its hidden configuration files) and there are little applications that you can install that simplify changing the parameters of various sound cards, say (more on this later). But occasionally the configuration files must be edited by the user using a text editor. One typo, and the PC may refuse to boot!
As I mentioned, I am hoping to run my active crossover stuff on Linux, not Windows. In order to achieve this I must loop continuously doing the following:
1. Extract a chunk of stereo audio from an ‘input port’ that receives data from my application of choice (media player, Spotify etc.)
2. Assemble the data into fixed-size buffers to be FFT-ed.
3. Process with FIR filters to produce a separate, filtered output for each driver.
4. Inverse FFT.
5. Squirt the results out to six or eight analogue channels, or if feeling ambitious, HDMI (that would be the dream!).
It’s a very specific, self-contained requirement. I can handle numbers 2 to 4, no problem. Numbers 1 and 5 are the tricky ones, and seem to be a lot trickier than they perhaps ought to be. They weren’t all that easy in Windows, either, but I eventually came up with a scheme that kind of worked.
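For the middle steps (buffering, FFT, FIR filtering, inverse FFT), here is a minimal sketch of one common way to do it, overlap-add FFT convolution. It is an illustration only, not the code from my crossover: the class name `OverlapAddCrossover`, the tiny two-tap filters and the single mono input are all made up for the example, and the hand-rolled `fft` is there purely to keep it self-contained (a real implementation would use an optimised FFT library).

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT with the 1/N normalisation applied once at the top level."""
    n = len(spectrum)
    return [v / n for v in fft(spectrum, inverse=True)]


class OverlapAddCrossover:
    """Splits one input channel into one filtered output per FIR filter."""

    def __init__(self, firs, block_size):
        longest = max(len(h) for h in firs)
        fft_size = 1
        # The FFT must be long enough to hold a full linear convolution.
        while fft_size < block_size + longest - 1:
            fft_size *= 2
        self.block_size = block_size
        self.fft_size = fft_size
        # Pre-compute each filter's spectrum once.
        self.spectra = [fft(list(h) + [0.0] * (fft_size - len(h))) for h in firs]
        # Per-output "tail" that spills over into later blocks.
        self.tails = [[0.0] * (fft_size - block_size) for _ in firs]

    def process(self, block):
        """Takes block_size input samples, returns one output block per filter."""
        if len(block) != self.block_size:
            raise ValueError("block has the wrong length")
        x_spec = fft(list(block) + [0.0] * (self.fft_size - self.block_size))
        outputs = []
        for i, h_spec in enumerate(self.spectra):
            # Multiplying spectra is the FIR filtering step.
            y = [v.real for v in ifft([a * b for a, b in zip(x_spec, h_spec)])]
            for k, carried in enumerate(self.tails[i]):
                y[k] += carried
            outputs.append(y[:self.block_size])
            self.tails[i] = y[self.block_size:]
        return outputs
```

With a complementary low-pass/high-pass pair such as `[0.5, 0.5]` and `[0.5, -0.5]`, the two outputs should sum back to the input, which makes a handy sanity check before moving on to real filter designs.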
Here’s where it gets very specific: under XP I was able to use a single Creative X-Fi surround sound card as both the ‘receptacle’ for PC audio which I could then access with my application, and also as the multichannel DAC that my application could squirt its output to. Under Windows 7 the driver for the sound card was ‘updated’ and I could no longer access it as the receiver for general PC audio – I could still have used it for S/PDIF, analogue Line In etc., however. In the ideal world, the ‘receptacle’ would just be some software slaved to the output sample rate, I think, but I don’t know how to create such a piece of software – it would appear to Windows to be a driver I would guess. I could buy a piece of software called Virtual Audio Cable but I could never be sure whether that would always be re-sampling the data, and I’d rather avoid that. In the end, I used a method that I knew would work: I slaved a ‘professional’ audio card to the X-Fi using S/PDIF from the X-Fi. The M Audio 2496 can slave its sample rate to the S/PDIF (using settings in the M Audio-supplied configuration application) so I was able to send PC audio to the M Audio and my application could extract data from its ‘mixer’ at the same sample rate. Keeping the input and output on separate cards like this has some advantages when it comes to making measurements of the system while it is working, I think.
As a start I will probably try to do the same thing under Linux. I am attempting to use an Asus Xonar as the multichannel DAC, and another M Audio card I had lying around as the slaved source. It’s almost certain that I could achieve the objective without a second sound card, but I really don’t know how to do it [update 30/06/15: maybe I do know how to do it now]. Linux audio seems to have several ‘layers’ that I don’t understand (but as yet I have no view of them as layers, more as spaghetti). Really, I would like not to have to know anything about them at all, but this seems unrealistic. I have established the following:
- I can do lowish-level audio stuff using the ALSA API. I can refer to specific cards by names that I can bring up with certain command line (shell) queries. Are these names guaranteed to stay the same between boots? I don’t think so, but there are ways of editing the config files to associate names I choose with specific cards – I think.
- There is a highly comprehensive system called JACK that allows “JACK-aware” programs to have their audio routed via a user-configurable patchbay. It can handle re-sampling between separate cards transparently. Brilliant, but I don’t think Spotify is “JACK-aware” for example so I’m not bothering with it. [Update 30/06/15: I want to avoid any form of re-sampling anyway]
- Ubuntu has PulseAudio installed already (I think) and using an application (that I had to install) called pavucontrol I can direct Spotify, and presumably other apps, to send their outputs to any of the sound cards in the system. Does this get written to a file and saved when I exit it? I think so. PulseAudio may be the thing I need, possibly being capable of creating software “sources” and “sinks”. But is it always resampling the audio to match sample rates even when that is not needed? More investigation needed. [Update 30/06/15: PulseAudio cannot be guaranteed not to resample. I have removed it from the machine.]
- I installed a little program called Mudita24 that gives me most of the functionality of the app that is supplied for M Audio cards under Windows. It will let me slave the M Audio to S/PDIF. But without a lot of rummaging around on the web, finding this solution was not obvious. Will the results be saved to a file so I don’t have to call this up every time? I don’t know. [Update 30/06/15: the M Audio-compatible drivers don’t seem to work properly. I have abandoned this idea].
- I found a “minimal” example program that can send a sine wave to an output via ALSA. The program is anything but minimal and allows the user to select from a large number of alternative sample rates, bit depths etc. etc. and has copious error reporting. My version of “minimal” is much shorter! I adapted the program for eight channels, and am sending a separate frequency to each of the Xonar’s outputs. It seems to be working quite solidly. I can’t yet be absolutely sure that the Xonar isn’t applying surround sound processing to the signals, though. Question: should I be programming using ALSA or PulseAudio? [Update 30/06/15: the answer is most definitely ALSA only.]
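To give a flavour of the eight-channel sine test mentioned in the last item, here is a rough sketch of how the test signal itself could be built. This is not the ALSA program I adapted: it only generates interleaved 16-bit little-endian frames in memory, and the function names, the 48 kHz rate, the amplitude and the choice of frequencies are all assumptions for the example.

```python
import math
import struct
import wave


def multichannel_sines(freqs_hz, rate_hz=48000, seconds=1.0, amplitude=0.2):
    """Returns interleaved signed 16-bit little-endian PCM, one sine per channel."""
    n_frames = round(rate_hz * seconds)
    frame_format = "<%dh" % len(freqs_hz)  # e.g. "<8h" for eight channels
    pcm = bytearray()
    for n in range(n_frames):
        t = n / rate_hz
        # One sample per channel; together they make up one frame.
        frame = [int(amplitude * 32767 * math.sin(2 * math.pi * f * t))
                 for f in freqs_hz]
        pcm += struct.pack(frame_format, *frame)
    return bytes(pcm)


def write_wav(path, pcm, channels, rate_hz):
    """Wraps the raw frames in a WAV header so ordinary tools can open them."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 16-bit samples
        wav_file.setframerate(rate_hz)
        wav_file.writeframes(pcm)


# A different frequency on each of eight outputs makes channel mix-ups audible.
TEST_FREQS_HZ = [100 * (ch + 1) for ch in range(8)]
```

Calling `write_wav("test8.wav", multichannel_sines(TEST_FREQS_HZ), 8, 48000)` should give a file that a command line player can send to the card. If any output carries a frequency other than its own, or a mixture, that would suggest the card or driver is remapping or processing the channels.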
I don’t mind if everything is low level, nor do I mind if the operating system handles everything for me. What I am not keen on is a hybrid between the operating system doing some things automatically, and yet having to manually edit files (I haven’t done that yet, though) or having to install little apps myself. How are they all tied together? I don’t know.
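On the question of card names: as far as I can tell, ALSA lists each card with a numeric index and a short ID string in `/proc/asound/cards`, and the ID tends to be the safer thing to refer to because the index depends on the order in which drivers load. The sketch below parses text in that layout and builds a device string of the form `hw:CARD=<id>,DEV=<n>`. The helper names are my own, and the sample listing in the comment is invented for illustration rather than copied from my machine.

```python
import re

# Matches lines such as: " 1 [DX             ]: AV200 - Xonar DX"
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[(\S+)\s*\]:\s*(\S+)\s+-\s+(.*)$")


def parse_cards(text):
    """Parses the contents of /proc/asound/cards into a list of dicts."""
    cards = []
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:  # the indented long-name lines do not match and are skipped
            index, card_id, driver, name = match.groups()
            cards.append({"index": int(index), "id": card_id,
                          "driver": driver, "name": name.strip()})
    return cards


def device_string(cards, card_id, device=0):
    """Builds an ALSA device name that refers to the card by ID, not index."""
    for card in cards:
        if card["id"] == card_id:
            return "hw:CARD=%s,DEV=%d" % (card_id, device)
    raise LookupError("no card with ID %r" % card_id)


def read_system_cards(path="/proc/asound/cards"):
    """Reads the live listing; only works on a Linux machine with ALSA."""
    with open(path) as handle:
        return parse_cards(handle.read())
```

An application could then open the DAC by its ID and not care whether it came up as card 0 or card 1 on a particular boot.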
UPDATE 10/03/15 Installed Ubuntu on my erratic Windows 7 laptop. I had to delete the ‘HP Tools’ partition from the hard drive to do it, as a disk with the traditional MBR partition table can only have four primary partitions, apparently, and HP had used all four to install Windows – the things you learn, eh?
For the things I use the laptop for mainly, Ubuntu is knocking Windows 7 into a cocked hat. It actually responds instantly and doesn’t hang for tens of seconds with the disk light on constantly and the mouse pointer frozen. It’s taking some getting used to!
UPDATE 15/03/15 It is becoming clear to me that there is only one sensible solution for what I am trying to achieve: an active crossover / general DSP system under my control that can be applied to any source including streaming, that is guaranteed not to resample the data, that does not depend on sound card-specific features, and that does not need two sound cards. Let me run this by you:
- Media player apps need something that looks like a sound card to play into. Some apps will only play into whichever card is set as the default audio device.
- If it’s a real sound card that’s being played into, I need to extract the data before it reaches the analogue outputs. This just may not be possible with many sound cards, and it is impossible to know without trying the card – no one cares about this issue normally.
- I process the data into six or eight channels and then I need to squirt the results out to, effectively, some DACs (or HDMI). This is most likely a real, physical multi-channel sound card.
- I believe that the media player’s sample rate is defined by the sound card it is playing into. If so, this is akin to asynchronous USB mode i.e. the media app is slaved to the sound card’s sample rate.
- I would like to avoid sample rate conversion (and this would still be needed to convert between 44.09999 kHz and 44.10001 kHz i.e. there is no such thing as “the same sample rate” unless they are derived from the same crystal oscillator).
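To put a number on that last point, here is a small back-of-envelope calculation using the two rates quoted above. The function name and the 512 frames of buffer headroom are assumptions picked purely for illustration.

```python
def drift_report(rate_a_hz, rate_b_hz, headroom_frames):
    """Quantifies how two nominally equal sample clocks drift apart."""
    diff_hz = abs(rate_a_hz - rate_b_hz)  # frames gained or lost per second
    ppm = diff_hz / min(rate_a_hz, rate_b_hz) * 1e6
    frames_per_hour = diff_hz * 3600
    # Time until a buffer with this much spare room overflows or runs dry.
    seconds_to_xrun = headroom_frames / diff_hz if diff_hz else float("inf")
    return {"ppm": ppm,
            "frames_per_hour": frames_per_hour,
            "seconds_to_xrun": seconds_to_xrun}


report = drift_report(44099.99, 44100.01, headroom_frames=512)
print("mismatch: %.3f ppm" % report["ppm"])
print("drift: %.1f frames per hour" % report["frames_per_hour"])
print("buffer trouble after about %.1f hours" % (report["seconds_to_xrun"] / 3600))
```

Even this tiny mismatch of under half a part per million amounts to roughly 72 frames per hour, so a buffer sitting between two free-running clocks will sooner or later overflow or run dry unless something resamples or one clock is slaved to the other.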
There is a Linux driver called snd-aloop which can act as a virtual audio node, recognisable by media player apps as a sink, but also recognisable by other apps as a recording source. I could send media player output into this virtual device, recognise it as a source for my application, process the data and send the multi-channel audio to a consumer-level DAC card without it needing any special features. However, there is a subtle problem: aloop’s sample rate is derived from the system-wide “jiffies” count. It will not match the sample rate of the DAC card even if they are both nominally 44.1 kHz.
I see just one sensible solution: I have to modify the aloop code so that, when the information is available, it gets its sample rate synchronisation from the DAC card. I could either modify aloop and send it this synchronisation information via a ‘pipe’ or shared memory (if that’s possible) from my active crossover application, or I can make my active crossover application a virtual sound card driver itself. Either way, I would need to register the driver with the system so that it can be set up as the default audio device (using the usual GUI-based sound preferences).
To any Linux programmers out there: does this sound sensible and do-able?
Update 30/06/15: It seems that there is an updated version of the snd-aloop driver which incorporates a dynamically-adjustable sample rate via the ALSA PCM interface. This could be precisely what I need.
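To make the rate-slaving idea a bit more concrete, here is a toy simulation of the feedback loop I have in mind: the application watches how full the buffer between the virtual source and the DAC is, and nudges the source’s rate to hold that level steady. Everything in it is hypothetical: the function name, the proportional-only controller, the gain and the numbers are chosen for illustration, and the “trim in ppm” is an abstract stand-in, not the actual control exposed by snd-aloop.

```python
def simulate(dac_rate_hz=44100.5, nominal_rate_hz=44100.0, seconds=600.0,
             period_s=0.1, target_fill=2048.0, gain_ppm_per_frame=1.0,
             adjust=True):
    """Simulates the buffer level between an adjustable source and a fixed DAC.

    Returns (final_fill_frames, final_trim_ppm).
    """
    fill = target_fill
    trim_ppm = 0.0
    for _ in range(round(seconds / period_s)):
        # The source runs at its nominal rate, scaled by the current trim.
        source_rate_hz = nominal_rate_hz * (1.0 + trim_ppm * 1e-6)
        # The source fills the buffer while the DAC drains it.
        fill += (source_rate_hz - dac_rate_hz) * period_s
        if adjust:
            # Buffer below target means the source is too slow: speed it up.
            trim_ppm = gain_ppm_per_frame * (target_fill - fill)
    return fill, trim_ppm


free_fill, _ = simulate(adjust=False)
locked_fill, locked_trim = simulate(adjust=True)
print("free-running: buffer ends at %.1f frames" % free_fill)
print("slaved: buffer ends at %.1f frames, trim %.2f ppm" % (locked_fill, locked_trim))
```

In this toy model the free-running case loses 300 frames in ten minutes, while the slaved case settles a few frames below the target with the trim matching the DAC’s offset of about 11 ppm. A real implementation would need to measure the fill level from the sound card’s reported position and would probably want some smoothing, but the principle is the same: no samples are invented or discarded, only the pace of the virtual device changes.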