
Delay in Large Format Digital Music Consoles
by John Klett (c) 1999-2011 by John Klett, Carmel, NY
http://www.technicalaudio.com

 

(This article may be reproduced for noncommercial purposes only if it is copied in its entirety, including this notice and all notices of copyright and authorship contained on the originating web page at www.technicalaudio.com and in the article itself. Information directing readers to the TechnicalAudio.com web site would be appreciated. (c) 1999-2014 by John Klett, Carmel, NY - mailto:techmecca@me.com - CONTACT: John Klett)

Release Version 1.0 - 12 September 2000

This document is Release Version 1.0
Version 0.9.2 is the beta test version I put up on www.technicalaudio.com in November 1999.

In Release Version 1.0 I fixed/updated some wording, corrected a few typos in the main body of the document and then added a section at the end called "Post Processing for Release Version 1.0"

Large Format Digital Music Consoles - an article by John Klett - Release Version 1.0

Introduction and Preamble Bits

All technologies have advantages and disadvantages. Large Format Digital Audio Consoles are no exception. This paper focuses on specific issues dealing with the application of large format digital consoles to music production.

We are always looking for solutions to existing problems and ever more powerful tools to work with. It is often the case, when a new technology comes along, that we see in it what we want to see well before we notice any new problems that may come along in the same package. Manufacturers invest effort and expense developing new things to sell us and, of course, they are going to tell us everything we want to hear about their new products and little, if anything, about potential problems we may encounter as a result of the new technology. We, as users, want to hear all the good things and are often unwilling to think too much about what new problems may result - we don't want to spoil the party. I would venture to say that the desire for new tools and technology as a means of maintaining an edge over the competition has us pressurized to the point that skepticism is actively suppressed in our community. Those who point out problems are considered, at least, "party-poopers".

The purpose of this paper is to condense a number of thoughts on large format digital consoles, how they work in a music production environment, with specific regard to the latencies and processing delay inherent to all digital consoles. This particular paper started in October 1999 when I attended the 107th AES show in New York on behalf of a private producer-owned facility. My task was to take a look at large format digital music consoles in order to re-evaluate some recent purchase decisions and see if things have changed enough over time to warrant a change in equipment. In September 2000, this is still an active issue. None of the manufacturers of these products offer a complete system of management for all aspects of delay and latency in digital music consoles. Currently the Sony Oxford has the most complete and functional delay management system but it does fall short of the ideal in some aspects. The ideal would let the operator work on a digital console without having to think about latency at all - pretty much as one would on an analog console. It may be that we will never achieve this goal but I feel we can get a lot closer if we define and focus some attention on specific problem areas.

Main Bits

There are a number of large format digital music recording consoles available and quite a few are finding their way into studios. New, powerful technologies bring us a whole new set of production tools to add to those we already have. New problems have come with this technology as well. There are new challenges for design engineers working on the "front-end". They have to figure out the best way to present all the functionality and power to the operator in some comprehensible form such that the new user interfaces will not require long and steep learning curves. The design engineers working on the "back-end" have to choose from a number of emerging or mature processing technologies and develop an engine and software to run on it. They must come up with the best set of compromises to satisfy sets of competing priorities from end users, marketing departments and financial officers.

At the end of all this development, we have almost a dozen products that can be considered for filling the new role of large format digital console for music production. These new consoles are replacing large analog desks with new control surfaces that control powerful digital processing engines that provide all the functionality of their analog predecessors while offering a host of new options to the end-user.

We have a number of approaches to the control surface in evidence. The SSL Axiom MT offers the most traditional "knob per function" control surface. It is not totally knob per function, as things like auxiliary sends are "paged", but it is close and very familiar looking. SSL has the largest user base of all manufacturers for their analog multitrack music consoles and the Axiom MT should be a very easy transition for these users. SSL has always offered operational consistency across their music console line from the very beginning to the present. The majority of the control surfaces give the user a bank of knobs and a display on a per channel basis. These "paged knob per function" control surfaces are less cluttered and offer varying degrees of presentation of a more comprehensive single channel set of functions in the center or master section. The Sony Oxford offers a very minimal number of controls on a per channel basis and spreads all the controls and informational displays across the desk to provide a very comprehensive set of controls for one (or two) channel(s). All of the control surfaces I have looked at are comprehensible and learnable. As in the early days of large analog mixing desks, we will see a number of vastly different approaches evolving and merging into variations on a common theme.

The back-end or "engine" is where everything is happening. The newer, faster and more powerful engines are using widely supported off-the-shelf DSP processors like the Analog Devices Super Harvard Architecture (SHARC) DSP chip sets. These systems have hardware and software architectures that permit incremental or modular upgrade, as newer and more powerful processing technologies become available. SHARC and similar processors and software development tools are widely used and supported across many applications, industries and platforms. The development costs and product lead times are vastly shorter than was the case in systems that were in development only five or six years ago. Older engines were developed and evolved out of the proprietary hardware and software that was the only thing available to the pioneers in this product category. Manufacturers who got into digital console technology early, broke the ground and made the heavy investments in research and development were living on the bleeding edge. Today some of these manufacturers appear to be overly invested in hardware and software architectures that may be more difficult and expensive to upgrade. They may have to abandon large parts of their older technology before development costs have been recovered.

Back-end performance has to be judged by the overall performance of the console system, the tools offered and by how it sounds. One can evaluate how much power one can buy in a system versus dollar cost. This is a perfectly valid way to look at it. On the other hand one can ignore cost and look at audio performance and power as a production tool in an absolute sense with no reference to cost. Technical critique of DSP/processing architectures, conversion technologies and other supporting elements provides a good background for understanding some of the underlying aspects that make one product better than another - at least in theory. As a practical matter I leave that to others and look at the product through the eyes and ears of whichever client I am working for. Does it work for the client and does it sound good?

As a systems designer and integrator I have to concentrate on many of the physical aspects and certain interface aspects of the back-end. Ultimately these aspects should have nothing to do with the choice of any particular system but certain things like timing references, synchronicity and sample rate conversion do have quite an impact on overall system performance and sound quality.

Sample Rate Conversion (SRC) is always an issue. No SRC is 100% sonically transparent, though some of the DSP based converters are very good. The SRC offered in almost all the digital consoles being manufactured today is to be avoided at this point, as they are using the Analog Devices AD 1890 ASIC (Application Specific IC) which is not 24 bit transparent. The AD 1890 truncates to 20 bits at its input with every pass through it. Ideally, all of the various devices in a given system should be running synchronously so SRCs are not required and can be bypassed. In order to do this I want to be able to lock everything to a master clock source. I try to avoid video as a reference for digital audio gear. Locking to video requires a Phase Locked Loop (PLL). Few PLLs can lock and stabilize quickly, be very stable over time AND maintain low jitter. I always look for an AES, Word Clock (WCK) or some form of over-sampled WCK reference input.

Various system architectures present advantages and disadvantages with regard to installation issues. A system that uses MADI or any similar multi-channel serial interface to connect a system processing core to various i/o "racks" might permit these racks to be remotely located - i.e. mic preamplifiers moved local to the studio, analog i/o moved local to outboard analog processor racks, etc. Things like fan noise would then become important with these remote i/o boxes - where it would not be a factor at all in a machine room environment.

All of what I have mentioned up to here is fairly straight ahead stuff and over time we will see what works and what doesn't simply by looking at how the various consoles fare. Though it often seems that sound quality and musicality are not at the top of everyone's list of priorities, they are considered along with all the other factors and, of course, marketability. Here we have studio owners and managers deciding what to buy based on their perception of what will sell the most studio time and amortize the quickest. Historically this has often not had much to do with absolute quality of sound but more to do with how many people are out there who will be able to walk in and comfortably drive the desk. At some point we may see one of the large format consoles take on the digital console equivalent role of the SSL 4K. Which manufacturer fills this market segment is for the market to decide.

 

The Time Axis - Delay is a huge issue...

A topic raised with digital consoles repeatedly has to do with delay time or latency. This topic and numerous related issues are of great concern to my client so I spent quite a bit of time looking closely at how the various manufacturers deal with it.

Every console on the planet has some propagation delay - we know this. Digital and Analog alike - it is there. It is just a matter of degree and once delay gets up over a certain point it can be viewed as significant. In an analog console "significant" has been defined a number of ways. In terms most people can agree on we can say that if the propagation time causes a relative phase shift channel to channel of greater than 90 degrees at 20KHz there is a problem. Problems that can't be made to vanish have, at least, to be managed... and here is where digital consoles in general have a problem.
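The "90 degrees at 20KHz" criterion is easy to put into numbers. Below is a minimal Python sketch of the arithmetic; the function name is mine, and the 8 microsecond figure is simply the worst-case analog insert delay discussed later in this article:

```python
# Relative phase shift, in degrees, produced by a propagation delay at a
# given frequency: phase = delay * frequency * 360.
def phase_shift_deg(delay_s, freq_hz):
    return delay_s * freq_hz * 360.0

# The threshold above: a 90-degree shift at 20 kHz is a quarter period,
# i.e. (90/360) / 20000 = 12.5 microseconds of delay.
threshold_s = (90.0 / 360.0) / 20000.0
print(round(threshold_s * 1e6, 1))                 # 12.5 (microseconds)

# An 8-microsecond analog insert stays under that threshold:
print(round(phase_shift_deg(8e-6, 20000.0), 1))    # 57.6 (degrees)
```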

Let's look at several specific examples and define things.

1) Self-Monitoring in Cue.

Put yourself in the position of a vocalist. You are very good and want a cue mix with your voice in it so you can hear yourself as you sound. You need to hear how you sound fairly accurately so you can moderate your tonality.

Typically, in an all-analog path, it takes between 2 and 8 microseconds for a signal to make the trip from a microphone, through a preamp, analog dynamics and EQ to a hole in the patchbay that I would then patch in to a converter input on a console.

Typically, in a digital console path, it can take between 1.5 and 4 milliseconds for a signal to make a trip through a converter input, through whatever routing is involved, through whatever signal modifying processes are applied, through a mix process, through a DA and out a hole on the patchbay that I would then plug in to a cue amp.

You are singing. You have headphones on with the cue mix in them. Now - isolate your own voice. That voice you hear is actually a combination of two signals. Both of these are your voice.

One is the one you always hear when you speak. This is the one that travels that short path from your larynx, throat, mouth, sinuses etc. through the bones and meat of your head and to your ear... to some extent, if the headphones are not entirely sealed, you also get a little bit of this through the air and through those holes on the sides of your head.

The second path is through the air to the microphone through whatever processing and console is involved and back to the cue amp and headphones.

In an analog console this second path is typically under a millisecond if the mic is close in. MOST of this delay time is acoustical. Most singers who really want to hear themselves tend to get right on top of the microphone but even at eight inches the acoustical delay will be on the order of 750 microseconds. This is about 250 microseconds longer than the shortest "through the air" path from mouth to ear. The electronic delay in an analog console adds, at most, another 10 microseconds... not much. In a studio with an analog console it is possible to have the electronic fold back (cue) timing sitting somewhere close to the acoustical "through the air" fold back. Cue never sounds perfect and it never sounds to you, as a vocalist, the way you sound to yourself in - say - a moderately live room singing with no headphones... but it is quite acceptable and relevant to how you sound "going to tape".

In a digital console you have a lot of delay - relative to the short delay times we encounter in all-analog studios. A/D converters have delays on the order of 750 microseconds, and MADI and other routers often impose another 120 to 200 microseconds with each pass through them. Signal processing times vary widely but each process can add anywhere from 40 to 160 or more microseconds. These processes add up because EVERYTHING is a process in a digital console... ADA conversion, SRC, EQ, Filter, Dynamics, Level Control (Fader), Mix, Mix Level Control (Buss Fader), and others are all processes. Internal processing delays can add between 200 and 2000 microseconds before passing on through the router again and on to the D/A converter with more time added there.

Let's just pick a typical number like 2500 microseconds for a signal to go from the microphone, through a digital console and out to the headphones. One of the two signals you (the vocalist) are hearing is delayed. How you hear the combination of these two signals is very dependent on two things besides delay time. One is the relative mix level and the other is relative phase. When the "direct" vocal and "delayed" vocal are at relatively the same level you will hear a major difference in tonality that you would not hear in a similar short-delay analog-path fold back. At 2500 microseconds this tonality shift would be heard as loss of low end or "warmth". This is due to phase cancellation between the direct and delayed signals. It is a comb-filter effect and makes all kinds of holes and bumps in the perceived response. It makes for a heavily colored self monitor sound. Flipping phase on the delayed signal will move all the bumps and holes around but it won't make the foldback more useable.
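The comb-filter effect is easy to quantify: with the direct and delayed copies at equal level, cancellation nulls fall at odd multiples of half the inverse delay. A quick Python sketch using the 2500-microsecond figure from the example (the function name is mine):

```python
# Comb-filter null frequencies when a signal is summed with a copy of
# itself delayed by tau seconds at equal level: f = (2k + 1) / (2 * tau).
def null_frequencies_hz(tau_s, count):
    return [(2 * k + 1) / (2.0 * tau_s) for k in range(count)]

# For a 2500-microsecond console path the first null lands at 200 Hz -
# squarely in the "warmth" region of a voice, hence the loss of low end.
print([round(f, 1) for f in null_frequencies_hz(2.5e-3, 4)])
# [200.0, 600.0, 1000.0, 1400.0]
```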

Changing the relative mix of delayed versus direct signals WILL change the severity of the problem but typically this means turning up the headphones since the direct signal is fixed. There is no gain or phase flip control on your head. As with other multi-path phase cancellation scenarios, the relative mix has to put one or the other of the two signals 60dB or more away (delayed foldback 60dB LOUDER in this case) in level before things begin to smooth out. This is fine if you LIKE to listen to a very loud cue mix and don't care much about your ability to hear very well in the future.

The only way to eliminate this problem is to eliminate (practically) the delay - and today this means keeping the foldback monitor path all analog. This does not mean that the entire cue mix has to be analog - it just means that the path through which you hear yourself has to be very short - preferably well under a millisecond - INCLUDING the acoustical delay.

When a digital console can pass a signal from the A/D input, all the way through and to the CUE Mix Feed D/A output in under 250 microseconds it can be used for foldback with similar results to an analog console.

Delay times in digital consoles are connected to sample rate and processing efficiencies. The processing technologies in both hardware and software are large factors. If we assume that algorithms are very slick and efficient, powerful processing hardware is zipping along quite zippily, signal routing paths are very short and direct and so on, then, in theory, sample rate tends to become the dominant factor. If latency time depends only on sample rate then 2500 microseconds at 48KHz will reduce to 1250 microseconds at 96KHz. At 192KHz it might approach useable (for foldback) at 625 microseconds... but we are not there yet and we won't be for some time to come. Doubling sample rate in a processing core without doubling the processing power or "throughput" will not cut transit time in half. Higher sample rates will shorten transit time through A/D and D/A converters whose latency in most cases is stated as some number of samples (though rarely an even number of samples) divided by the sampling frequency. It is doubtful that a digital console will exist in the near future that will have insignificant mic-to-cue delay time and, until foldback path delay is down near 250 microseconds, purely digital consoles won't give good head... phone.
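If latency really were a fixed count of samples, the scaling described above follows directly. A Python sketch of the point; the 120-sample pipeline length is a hypothetical figure chosen only to reproduce the 2500-microsecond example, not a measured value from any console:

```python
# Latency of a pipeline that is a fixed number of samples long, at a
# given sample rate, expressed in microseconds.
def latency_us(pipeline_samples, sample_rate_hz):
    return pipeline_samples / sample_rate_hz * 1e6

PIPELINE = 120  # hypothetical total pipeline length in samples
for fs in (48000, 96000, 192000):
    print(fs, round(latency_us(PIPELINE, fs), 1))
# 48000 2500.0
# 96000 1250.0
# 192000 625.0
```

In practice, doubling the sample rate without doubling processing throughput lengthens the pipeline in samples, which is exactly why real transit times do not simply halve.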

One solution I proposed "back in the day" when we were seeing delays of around 17 milliseconds from microphone input to analog cue feed (in what I guess was a second generation digital console) was to have a console that is a hybrid of a digitally controlled analog console and a digital console/mixer. A cue mix in such a console would have the foldback from microphone to headphones handled totally in the analog domain that would provide a delay-free (read this as "insignificant delay time") foldback path for self-monitoring. This was not taken seriously then but today there is one console manufacturer that offers this as an option (Harrison - in a hybridization of the Series 12 digitally controlled analog console and the Digital Engine DSP based audio processor core) and at least one other analog/digital hybrid console product is in development that I know of.

The practical solution, at present, is to split the mic signal off and mix that with a cue mix of everything you want to hear (but yourself) from the console. Then you have the cue mix with a foldback of yourself that you can use.

 

2. Timing and Synchronization Channel to Channel.

Put yourself in the position of an engineer or producer who is sitting in front of a mixing desk with 96 or more inputs to keep track of and mix. You have all functions automated so when you put your attention to a particular track or group of tracks in the mix you can think about it once, do whatever manipulations you want to do and move on. Those "moves" are remembered by the automation and every time you play the mix back all the things you have done up to now repeat themselves consistently the same way every time. You can edit, correct, fine tune... EVERYTHING... until it is just the way you want. Automation makes it possible to creatively mix a large number of sources. It simply could not be done without some form of automation.

Now you decide that - while all the processing offered in the digital console is great - you want to get that fat snare sound you have always liked. You patch in an analog effect like an 1176 and tell the console to route the snare path through that effect and back.

In an analog console - as with the microphone to cue example above the delay time is quite short. Typically you have added between 2 and 8 microseconds to the snare drum path and thus pushed it back in time relative to the rest of the mix. In my measurements of analog outboard gear (no delay lines of course) the propagation times were always well below 5 microseconds and often below a microsecond. A typical transformered 1176LN measured roughly 2 microseconds from input to output while one of our passive equalizers measured 11 nanoseconds once I subtracted the cable delay. The proposition put forth by one representative from a digital console manufacturer was that delay time in analog outboard gear is on the order of milliseconds. I won't embarrass the person or the company by mentioning names.

Two things happen when you do make an insert on a channel in an analog console.

One - if there are other mics open that go to the mix and the snare drum is bleeding in to those tracks there is the potential for some shift in relative phase between those tracks. The sound of the snare might potentially change a little. At worst what would this 8-microsecond time shift do? You would see less than a 90-degree phase shift at 20KHz. It would take a delay of 12.5 microseconds to do that. It would take significantly more delay than this to noticeably impact the sound of the snare drum in this example. This is because the bleed levels are typically not anywhere near the same level as the close-mic'ed snare and the frequency content typically does not extend strongly up to 20KHz.

Two - the delay could potentially cause a shift in perceived "feel". Feel is a very subjective thing. Let's define feel as the "relative placement in time of rhythmic elements". The character of each element will make its placement in time more or less a factor in the overall feel. Snare drum has a large contribution to overall feel in your average pop mix. The only data we have on this at present is empirical. At one time Roger Nichols is said to have defined the limit of feel perception at around 250 microseconds for key elements. My own experience watching how certain producers place elements in time on digital audio workstations brings me to the conclusion that this perception limit is more like 100 microseconds. In any case, people who are very "feel conscious" will agree that we are looking at timing shifts well under a millisecond as important and affecting feel. With the 8 microseconds of delay in the example above we are well below the perception limit for changes in feel.

In all-analog systems the two delay issues as outlined above are not a problem.

In digital consoles we have another thing altogether. The same situation where you route the snare out to an analog processor and back can take between 1200 and 1800 microseconds or longer to wend its way out through a router core to a D/A converter, through the device and back, through an A/D converter, back through the router and to the mix path. This is not a small delay and it is well above the perception limit for changes in feel. It is also deep into the territory where you WILL hear a change in the sound of the snare when combined with other open "bleed" mics.

The solution here is to re-synchronize the snare to the rest of the channels in the console. This is done by adding delay to all the other channels to pad them out by the same amount of delay added by the analog effect loop. As an alternative one could slide the track forward in time on a hard disc based multitrack.
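The padding operation can be sketched in a few lines of Python. The channel names and the 1500-microsecond loop figure are illustrative only (the article quotes 1200 to 1800 microseconds for such a loop):

```python
# Re-synchronize after one channel picks up extra delay from an analog
# effect loop: pad every other channel so all share the same total delay.
def compensate_us(channel_delays_us, inserted_channel, loop_delay_us):
    delays = dict(channel_delays_us)
    delays[inserted_channel] += loop_delay_us
    target = max(delays.values())
    return {ch: target - d for ch, d in delays.items()}  # padding to add

# The snare goes out through a 1500-microsecond analog loop; the other
# channels each get 1500 microseconds of padding to stay time-aligned.
print(compensate_us({"kick": 0, "snare": 0, "vox": 0}, "snare", 1500))
# {'kick': 1500, 'snare': 0, 'vox': 1500}
```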

There are a number of things that come up here.

Where does the delay go in the signal path? If you slap it in as the first process in the chain then you will find that some of the busses mixed off the console may be synchronized and some may not be. Those whose mixed sources are picked off AFTER the effect loop will be in sync, assuming the delays were entered correctly. The mixed busses whose sources pick off the channels at a point before the effects loop will get the snare drum EARLY - so the feel will be "leaning forward" or rushed on those busses. This is quite a knotty problem that I won't even start to tackle here other than to say that blocks of compensating delays have to be distributed throughout the signal path. The manufacturers have to figure out how best to apply delays in the signal paths of their products so that all inputs to all mixed busses remain synchronized to within a half sample or better.

How do you, as an engineer, cope with the added burden of having to manage everything along this new dimension of time - something that you never had to think about nearly as much when working on analog desks? Do you make the insert and then dial a delay number into a large number of inputs? Do you have a certain amount of delay in place on all channels all the time and then dial out delay on the channel you just added the analog loop to? Can this whole process be automated so you don't have to think about it at all - an automated delay management system... wouldn't that be nice?

In an automated system - what has to be factored in?

It is probably unrealistic to expect any automated delay management system to account for items external to the console system itself. Though, with some form of user defined database of delay times per outboard piece, this is something that would not be impossible. A scheme that could do the same thing might include a pulse generation and measurement routine that would allow you to "zero out" these delays if you want to.
An automatic delay management system should, at least, be able to keep all inputs synchronized to each buss regardless of where the buss "picks off" from the path and regardless of what processes are in or out, active or not active. This means that blocks of delay need to be scattered strategically through the signal/processing path of each channel so that if a process is removed a delay block can "step in" and replace the delay formerly occupied by the process. There are many ways of accomplishing such a delay management goal.
Loops to the outside world should be timed to the ports and loop timing can be taken as a send-return port to port connection via a wire. This way the console system itself is not the cause of delay variations track to track. This is possible because generally these delays are a known quantity.
If a console channel is delayed because the outboard device has significant amounts of delay in it the user then might have the option to grab a knob and dial that channel back in to sync if he so desires. This would be a manual override or trim function.

Maybe the engineer does not want delay time to be managed this way all the time so it should be possible to turn off the automatic delay management system and put it under manual control. There could be an option that defines a maximum delay time. There could be a way of taking a channel or group of channels out of an automatic delay management system so they can be put under manual control or under a "shortest time" mode. There are a number of variations on a theme... Synchronized Groups for example... It should be possible to have a number of operating modes that would make the console work for everyone depending on personal preference and application.
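The "delay block steps in" idea can be sketched simply. In this Python illustration the process names and per-process delay figures are invented for the example, not measured console values:

```python
# Keep a channel's total latency constant: wherever a process is taken
# out of circuit, substitute a delay block of the same length.
def build_path(processes, bypassed=frozenset()):
    path, total_us = [], 0
    for p in processes:
        if p["name"] in bypassed:
            path.append(("delay_block", p["delay_us"]))  # compensator steps in
        else:
            path.append((p["name"], p["delay_us"]))
        total_us += p["delay_us"]
    return path, total_us

chain = [{"name": "eq", "delay_us": 80},
         {"name": "dynamics", "delay_us": 120},
         {"name": "fader", "delay_us": 40}]
print(build_path(chain)[1])                 # 240 - all processes in circuit
print(build_path(chain, {"dynamics"})[1])   # 240 - total latency unchanged
```

Buss pick-off points anywhere along such a path then see a constant, known delay whether the process is active or not.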

3. Harmonic Enhancement is really an extension of item 2 above and discusses a feature that does not exist on every digital console. The analog process outlined here takes two channels on most analog desks but there is no reason this could not be done within one channel using a send and return loop as above with a "blend" control on the return.

Harmonic enhancement is a very common effect and appears to be used more and more as audio mixing systems and outboard equipment become cleaner and cleaner. As an engineer/producer you might want to send that snare drum track out to a harmonic generating processor like one of several Aphex models, an EXR exciter, a BBE or even a compressor or limiter driven into non-linearity. Those devices generate odd and/or even harmonics that you want to bring back and blend in with the snare to give it some top or "bite". There is even a digital box from Crane Song that does this. Most engineers prefer to do the blending of dry with harmonics in the console and not use the mixing feature offered in most of these processors. They have more control this way. The effect only works when there is insignificant time delay between the original signal and the harmonics generated from it in the outboard box. A delay management system has to be able to synchronize a "split and remix" path so that the original signal and the processed signal can be brought back together with no relative delay... and without retarding the whole track relative to other tracks.
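The "split and remix" timing amounts to a simple sum: the dry split must be held by the full round trip of the external loop before the blend. A Python sketch; all of the figures below are illustrative, not measured converter or device values:

```python
# Delay to apply to the dry split so it recombines with the externally
# processed signal with zero relative offset.
def dry_split_delay_us(da_us, device_us, ad_us, routing_us):
    return da_us + device_us + ad_us + routing_us

# e.g. D/A 400 us + enhancer 50 us + A/D 750 us + routing 300 us:
print(dry_split_delay_us(400, 50, 750, 300))   # 1500
```

The aligned pair as a whole then still has to be pulled back into sync with the other tracks, as the paragraph above points out.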

4. The Overdub is something that may appear to be impossible to get in time with the rest of the tracks on a multitrack when using a digital console. The tape plays and a cue mix is generated to a pair of headphones 2.5 milliseconds later and the new material is mic'ed and takes 4 more milliseconds to get back to the tape machine. The tape has gone by - the new material is going to go on to tape after the tracks that generated the cue mix have moved off the sync head. How do we manage that delay? First, we need to look at a couple of things.

Put yourself in the position of the perfect percussionist.

Let's assume that we have an analog foldback path so you hear yourself with no significant delay. In addition, let's assume that the tracks you are playing your perfectly timed percussion to are coming off the multitrack and mixed to cue in a digital console with a delay management system that works. It should be that the time from the output of the multitrack to the cue mix feed jack on the patchbay is known. That is part of what a delay management system should be keeping track of.

You hear the cue track you are playing along with 2.5 milliseconds after it has left the tape machine. You hear yourself with no delay because of the analog foldback self-monitoring system (above). You play your perfect percussion overdub and it goes into a microphone. Let's ignore acoustical delay by saying that whatever it is you are playing needs the microphone right on it or, better yet, the microphone is the same distance from the instrument as your ears are. You are USED to playing this way in an analog setting.

The sound/signal travels through the microphone, preamp, a STA-Level, a Pultec and 10 microseconds later it is at the A/D converter input. 4 milliseconds later it arrives at the tape machine. Too LATE! You are going to tape 6.5 milliseconds after the tape and all the tracks you were playing to have gone by. At 30 ips the tape has moved a little over 3/16" off the sync head so I guess we can't do it - too bad...

WRONG! - Of course

Here is where digital begins to save itself. It is perfectly true that an analog tape machine and a digital console together can't get an overdub perfectly in time. This is not the case if the multitrack is a digital tape machine like a Sony 3348 or a hard disc multitrack like a Euphonix R-1 or Fairlight Merlin. The reason for this - on a 3348 the heads are in this order... READ - WRITE - MONITOR. The write head sits significantly later in time than the read head where the tracks going to the cue mix are read. Remove all the delay from the read head to the machine output and the problem is reversed. At 6.5 milliseconds the audio has to be held until the tape approaches the write head. We know that digital consoles can do delay - they can't help it. Assume the delay management system "knows" the delay time from the point where the advanced signal leaves the machine and enters the console and assume it "knows" when, in time, relative to that, the signal needs to hit the machine so it writes to tape in sync. The console essentially replaces the delay in the 3348 with its own delay so that a "zero time" overdub can be accomplished. There is a maximum amount of delay in a tape machine beyond which you cannot loiter.
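The hold time the console must supply is a subtraction. In this Python sketch the 10-millisecond head-advance figure is hypothetical, not a measured 3348 value; the 2.5 and 4 millisecond path figures come from the example above:

```python
# On a read-before-write machine the read head output is advanced in
# time. The console must hold the new performance so it reaches the
# write head exactly in sync with the original tracks.
def overdub_hold_us(head_advance_us, cue_path_us, record_path_us):
    hold = head_advance_us - (cue_path_us + record_path_us)
    if hold < 0:
        raise ValueError("console paths exceed the machine's head advance")
    return hold

# e.g. a 10 ms read-to-write advance, 2.5 ms cue path and 4 ms record
# path leave 3.5 ms for the console's own "loitering" delay:
print(overdub_hold_us(10000, 2500, 4000))   # 3500
```

The "maximum amount of delay beyond which you cannot loiter" is exactly the case where the console paths eat up the whole head advance and the hold goes to zero.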

Hard disc recorders have more latitude. I suppose that a good hard disc recorder with data taken off far enough "in advance" would have made even the AT&T Disq core work for overdubs. Glenn Meadows would be the right person to ask but my recollection is that the AT&T console had an enormous throughput delay. It had synchronized channels and it had a delay management system. All it needed was an analog foldback system for self-monitoring and a hard disc recorder that could put out the tracks significantly advanced in time to have overdubs "go down" in sync. Of course, we are talking virtual tracks here but the analogy is perfectly valid.

 

Summary - Checksum Bits

 

The fact that digital consoles have delay is not at issue. The delay times can be shortened but it will be some time before that delay becomes as insignificant as it is in analog. Until then delay times have to be managed - preferably in a way that does not put yet another task on the engineer and/or producer who, after all, have to think about the actual performances and production values and don't really need to think about numbers. It is quite well known that most engineers and producers would prefer to have no need to use the numerical/computational side of their brain at all - at least not while they are recording, coaching performers to get the best take out of them, doing arrangements, overdubbing, being creative... and all those kinds of things that console manufacturers' design departments seem to have completely forgotten about.

Note that up to this point I have completely avoided talking about sound quality, musicality, specifics of control surface layout, jitter, word length, sync references, types of EQ, dynamics and other processing "modules", sample rate (well - a little) and a whole host of other things that other people spend a lot of time on. The reason I did not mention any of that is precisely because other people are talking about these things. This delay issue has come up time and again and every time it has been somehow put aside as a non-issue.

This is not a non-issue. I can say this because I have clients that have large format digital consoles now, and none of these delay issues are addressed in them at all. For some, it is not an insurmountable problem but for others, it renders the consoles practically unusable. One prominent engineer referred to one of these nearly million-dollar consoles as "a boat anchor" largely because of the delay issues I have outlined above. This is a console that was introduced in 1997 and is now called a boat anchor. Do commercial recording studios want to make large investments in consoles that might become more suitable for use as nautical equipment well before they have paid for themselves performing their intended function?

Eventually the topic of delay management in large format digital music consoles is going to come back and NOT go away. When this happens (I hope it is NOW) manufacturers will be forced into facing it square on rather than laughing it off or rationalizing it away one more time. When it comes back to stay and people start really taking it seriously, the manufacturers who have not done anything about delay management will have to get motivated or lose some or all of their prospective customer base for these products.

 

Post Processing from the DRAFT Version 0.9.2

This document had been circulated in draft form for review and comment... There were some email exchanges that I tagged, which follow... Names are changed to protect the innocent...
---------------------------

To: Fred
From: John Klett <contacto2016@technicalaudio.com>
Subject: Re: [Fwd: Hi - Delay Stuff]

At 6:58 PM -0400 11/3/99, Fred wrote:
>Klett wrote:
>> In digital consoles we have another thing altogether. The same situation where you route the snare out to an analog processor and back can take between 1200 and 1800 microseconds or longer ... >>snip<<

>[Fred] though the absolute delay will be a problem beyond a certain point, the mixture of this channel with the bleed channels is not an issue if I'm reading you right, as the bleed channels are subject to the same delay. Only the absolute delay in relation to the original source (once again headphones vs. open air) is the problem. To put a bit more perspective on it, the airborne delay might swamp this out anyway.

[Response from Klett]

In the snare drum "route out to an effect and back" example, that snare drum track that you are effecting will be moved back in time and lag relative to the other tracks that are not being routed in a similar way. This will cause a change in the "feel" of the tune - assuming the snare is a key player in establishing "feel" or "groove".

That snare track, when interacting with other tracks that have that snare bleeding onto them, may show a tonal change because of its change in time as well. In multi-mic situations there are all kinds of phase interactions, so these tracks all contribute to the particular sound of each element. By shifting the snare back by a millisecond or more in time you are changing all those phase relationships that add up to make the snare as heard in the track... it's like the mic moved. If every drum mic is totally isolated from every other drum mic then you are still changing the feel... but throw up a pair of overheads and you have the problem of tonal changes as well - just another thing that happens that you may then have to deal with in some way.

Smaller delays hit the first 180 degree rotation at a higher frequency than longer delays, and if the snare is really only making noise up to, say, 8 kHz, then a delay of 30 microseconds will start to attenuate the very top end of that (90 degree shift at around 8 kHz). A 1.5 millisecond delay will put that first 180 degree rotation at about 333 Hz - a bad place if you have a snare with some meat in it and the overheads are heavy with snare.
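These rotation points fall straight out of the delay: for a delay of T seconds, a 90 degree shift lands at 1/(4T) and the first 180 degree null (the first comb-filter notch) at 1/(2T). A quick Python check of the figures above:

```python
# For a direct signal summed with a copy delayed by delay_s seconds:
def quarter_wave_freq_hz(delay_s):
    """Frequency where the delayed copy is 90 degrees behind the direct path."""
    return 1.0 / (4.0 * delay_s)

def first_null_freq_hz(delay_s):
    """Frequency of the first 180 degree rotation - the first comb null."""
    return 1.0 / (2.0 * delay_s)

print(quarter_wave_freq_hz(30e-6))   # ~8333 Hz - the "around 8 kHz" figure
print(first_null_freq_hz(1.5e-3))    # ~333 Hz - the low-mid null on the snare
```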

So the snare drum example is separate from the self-monitoring example. Drummers are at an arm's and a stick's length from the drums anyway, and sound, travelling at roughly 1100 feet per second, takes a few milliseconds to get directly to her ears. She can cope with that and has all kinds of other sensory feedback to tell her when she is placing herself in time with everybody else.

The self-monitor delay is most critical for vocalists - that is why I gave that example for that aspect of the delay problem.

>[Fred] Also, a matter of balance, a description of the 3348 tied to an analog console should be mentioned. The delays thru that machine don't seem to have affected the ability to make records... THOUGH the delays thru the 3M digital DID!!

[Response from Klett]

I have mixed data on analog input monitor delay through a Sony 3348. ... one measurement - done several times because it was hard to believe - put the analog E to E at 4 samples. This is about 84 microseconds and would not really change the "tone". Figure "tone" as living somewhere below 1200 Hz (and that is probably high). A 180-degree shift at 1200 Hz would require a relative time shift between the delayed path and the direct path of five times that amount - around 420 microseconds. The Sony does have this [84 microsecond] delay but that alone is not enough to fuck up the cue sound.
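The same phase arithmetic, sketched out in Python - the delay values here follow from the formula, not from any new measurement:

```python
def phase_shift_deg(freq_hz, delay_s):
    """Phase lag (degrees) that a pure delay introduces at a given frequency."""
    return 360.0 * freq_hz * delay_s

def delay_for_shift_s(freq_hz, shift_deg):
    """Delay needed to produce a given phase shift at a given frequency."""
    return shift_deg / (360.0 * freq_hz)

print(phase_shift_deg(1200, 84e-6))        # ~36 degrees at 1200 Hz - benign
print(delay_for_shift_s(1200, 90) * 1e6)   # ~208 microseconds for 90 degrees
print(delay_for_shift_s(1200, 180) * 1e6)  # ~417 microseconds for 180 degrees
```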

>[Fred] Generally agreed with the opinions here... I would have gotten more into the specifics of delay management, not because its our responsibility, but only that if given too much latitude in addressing the problem, they [the manufacturers] might either miss the point or create other problems..

[Response from Klett] Well yeah - I only have so much time to churn this crap out and I really have to look at the clock all the time what with a house to finish paying for and kids and all...

>[Fred] BTW - [Z Systems] is advertising their SRC as 24 bit now... the new DA88 AES interface has SRC's in it, with a novel little switch and light that says "DIRECT" to turn them off.

[Response from Klett] I have that Z-Sys box. I mean I specified it and put it in at ... one of the studios I support in New York. It uses the Crystal part - the CS8420. That - unlike the 1890 - takes in all 24 bits with no truncation. All the SRC ASICs will lose bits in the up-filter-down process due to rounding but at least it does not start by slicing off 4 bits like the AD parts do. The [Crystal] chip also has a full bypass that connects the 3-wire port or AES receiver directly to the outputs, and there is a pretty good dithering doodad to get you down to shorter words at the output. All in all a cool chip for what it is. I'll be putting a lot of those in until something better comes along.

>[Fred] When you make the Band-Aids to [your client's] cue, where are you going to get the post mic pre, pre A/D converter signal from? Outboard mic preamps? not the [console]'s?.. What about timing delays (absolute and relative) between different A/D converters? I hear the Apogees are a pig on delay.

[Response from Klett]

Most good converters are time hogs.

The direct analog foldback path has to take the mic signal at the last point before you hit the console (or outboard) ADC.

[Client] uses outboard preamps, dynamics and EQ to make vocal and other overdub chains that then hit the ADC. We take the last analog output hole on the bay and split that off - for now - to ONE Mackie 1202VLZ.
---------------------------

 

Post Processing for Release Version 1.0

This paper was started nearly a year ago when I was looking into specific problems one of my clients was having with his recently delivered large format digital console. In the last year that particular manufacturer has been quite diligent in dealing with many of the issues I and others identified, but not all of them have been addressed, and it appears at this point that some remaining things won't be resolved any time soon. This may be true elsewhere, and it is a fact that a number of studios, commercial and private, that have large format digital consoles are looking to other manufacturers for solutions today. In several cases we are now seeing a few large format digital consoles leaving facilities after only a year or so in service to be replaced with analog consoles or perhaps another digital product from another manufacturer.

Not everyone is finding delay and latency a crippling problem, and delay and latency management is not the only issue that has yet to be properly addressed in digital consoles. Nor is it just one manufacturer whose consoles are leaving facilities barely a year after commissioning to be replaced by something else.

Why buy a digital console at all? There are many very solid reasons for doing so. One is that the handling of surround in digital consoles is far better and more useable than in analog consoles. Another is complete dynamic automation of nearly all functions and full recall/reset/snapshot of previous mixes. A third has to do with configurability and choices. Digital consoles make it possible to offer up a variety of choices in EQ and dynamics processes (types/algorithms) and to change the order of processes, routing and so on to suit the needs of the engineer. Not every digital console takes advantage of this, but given that the bulk of what you can and can't do in a given analog desk - the EQ, dynamics and buss structure - is hard wired, with only a limited amount of switching available to provide flexibility, one can see the potential in the (largely) software defined digital desks. In analog studios this kind of flexibility exists in the form of a patchbay and loads of outboard equipment, but the act of recalling an analog session vs. a digital session in a well implemented large format console is quite a different thing... so it comes down to usability and productivity.

For my money (assuming I would have money to spend on a studio) I would still tend toward an analog console for tracking and overdubs, but once the tracks are largely built I would lean strongly toward a digital console for overdub and mixing - especially if surround was an issue - and it will be. The kind of music I would be recording and mixing could be done in that two step process - tracking and mixing - however other producers and engineers are essentially mixing right from the start, and if they want to have that mix evolve smoothly from the first tracking/programming session through to the end then they need to stay on the same console they started with. If they have an analog console they will have all the recall issues to deal with as they move from song to song, and surround mixing and the lack of automation everywhere except faders and maybe some events will limit what they do. On the other hand, a digital console with a flexible signal path and choices in types and flavors of processing available within the console makes the changeover from song to song in a project easier and the whole process more fluid from beginning to end. Here is where automated delay management solutions to the examples cited in the paper above really come into play.

Other things regarding time:

In large MIDI systems the timing issues are such that differential delays imposed by a digital console are not the larger problem. You will see a similar thing in the larger multitrack workstations that use "plug-in" processes. In both of these cases you have other factors in addition to the console, and in both cases individual tracks can be slid in time to even things out.

MIDI controlled synths and sampler tracks tend to be isolated from each other, so the track-to-track phase shifts that would occur with ambient room mics and multi-mic drum recordings are not really a problem. Often the most you have to cope with in these systems are stereo drum or piano samples, and those are almost always processed as a pair and share a common path delay time. If the MIDI timing issues were to vanish then the console and differing input-to-mix path times would become significant.

In Digital Audio Workstations that use plug-in technologies, such as Pro Tools or Cubase VST - which I refer to collectively as "Virtual Studios" - there are a number of timing issues that I and many others have found problematic, and at some point I want to investigate these further and quantify them. There are time shifts that happen when shifting program to and from plug-in processes and in the latency of the processes themselves. There are applications that are supposed to tell you what a given plug-in is doing in terms of latency so the track can then be slipped forward to compensate... but we quickly found that the numbers generated by these plug-in timers are rarely accurate or consistent with actual timing changes. In larger systems it appears that there are various track-to-i/o delays depending on which i/o and which hard disk is used. In PCI card based systems this could be attributed to the time it takes to ship data from card to card and from card to hard drive/buss/interface. There may be six or seven cards with processing and i/o and two active HD drive interface busses, so there are quite a number of combinations with differing transit times - this is probably why plug-in timers are not giving us real numbers.
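The compensation itself is trivial arithmetic - pad every track up to the worst-case chain latency so relative timing is preserved - which is exactly why inaccurate reported numbers are so damaging. A hypothetical sketch, with track names and sample counts invented for illustration:

```python
# Hypothetical plug-in latency compensation as described above.
# Each track reports the total latency of its plug-in chain in samples;
# every track is then padded up to the slowest chain so they stay aligned.

def compensated_offsets(reported_latency_samples):
    """Extra samples of delay per track so every path matches the slowest."""
    worst = max(reported_latency_samples.values())
    return {track: worst - lat
            for track, lat in reported_latency_samples.items()}

chains = {"vocal": 64, "snare": 1024, "bass": 0}   # assumed reported values
print(compensated_offsets(chains))  # {'vocal': 960, 'snare': 0, 'bass': 1024}
# ...which only works if the reported numbers are accurate - and, as noted
# above, the plug-in timers were rarely giving us real numbers.
```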

At this point in time - a year after I wrote the draft of this paper, and discussing delay and latency issues with manufacturers on an ongoing basis - we still find that the Sony Oxford R3 is the only Large Format Digital Audio Console for Music that has an automatic delay/latency management system that is in place and working. Other manufacturers have recently implemented manual delay compensation schemes that involve assigning a starting packet of delay to all channels as part of a starting console template. Some consoles have a separate packet of delay in the signal "path" and others use delay incorporated into another process like dynamics. These manual "all channels have a delay pad that I can add to or remove time from in samples" approaches are simply formalized versions of what engineers have already figured out as a workaround. This "formalized Band-Aid" does not automate the process and pop the correct amount of delay in or out as insert loops are engaged. While manual delay compensation is better than none at all, it puts the actual burden of dialing in the right delay time and keeping everything synchronized (at a sample level) on the engineer, and it really should not be so. The engineer should be able to adjust delay times if he wants, but he should not be forced to deal with all the delay issues all the time if he does not want to. Engineers have lots of other things to do and think about already. I suggest that the manufacturers who are not working on some form of AUTOMATIC delay management system for their consoles, and who may have opted for a manually adjusted delay pad per channel, should consider that approach a stop-gap measure.
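To make concrete what AUTOMATIC means here, a minimal sketch - all names and latency figures below are invented, and this is not any manufacturer's implementation. The console tracks each channel's path latency as insert loops go in and out, and pads every other path up to the longest one, without the engineer dialing anything in:

```python
# Invented sketch of automatic console delay management. Latency numbers
# are illustrative samples, not measurements of any real console.

class Channel:
    def __init__(self, name, base_latency=0):
        self.name = name
        self.base_latency = base_latency   # samples through the stock path
        self.insert_latency = 0            # extra samples when a loop is in

    def engage_insert(self, loop_latency):
        """An insert loop is patched in; the console notes its latency."""
        self.insert_latency = loop_latency

    def path_latency(self):
        return self.base_latency + self.insert_latency

def compensation(channels):
    """Samples of delay to add per channel so all paths match the longest."""
    worst = max(ch.path_latency() for ch in channels)
    return {ch.name: worst - ch.path_latency() for ch in channels}

chans = [Channel("kick", 40), Channel("snare", 40), Channel("vox", 40)]
chans[1].engage_insert(72)      # snare goes out to an analog box and back
print(compensation(chans))      # {'kick': 72, 'snare': 0, 'vox': 72}
```

Disengaging the insert would simply recompute the table - that recomputation happening by itself, sample-accurately, is the difference between automatic management and the manual delay-pad workaround.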

I may add some more stuff to this but that is it for now.

 

In our rush to new technologies, we often leave good things behind. After some time, we figure out that we lost something. What was once considered crap is resurrected as classic. I often cite the introduction of the digital numeric readout watch as an example. At the time these things came out Kojak was on TV wearing a gold Accutron watch - very stylish. It ate four mercury cells a week and it had BIG RED LED DIGITS. Soon everyone wanted a watch with a numerical readout and some spent upwards of $600 for the first ones. They were eventually produced in quantity, the price dropped and features were added. Most everyone tossed their old analog watches into a drawer and donned new digital watches. It turned out - for me - that figuring out how much time I had before my 4:00 booking took a bit more effort. I used to be able to just "see" how much time I had left with the analog dial watch. With the numerical watch, I had to do math. Many people went back to wearing analog dial watches again or they bought ones with both the analog dial and a digital readout. Some people prefer numbers. Others have hybridized.

---------------------------


END

