[PlanetCCRMA] Re: Multiple Jack audio interfaces (was Tascam US-122 audio problem)

Mark Knecht <markknecht@gmail.com>
Sat Feb 26 16:00:01 2005


On Sat, 26 Feb 2005 15:10:08 -0800, Andrew Burgess <aab@cichlid.com> wrote:
> >> >>but with unmodified consumer PCI devices you have no
> >> >>source of sync, the drift will be substantially more than 1 sample every
> >> >>100k.
> >>
> >> >How much more substantial will it be? That seems like a high guess to me,
> >> >aren't crystals accurate to a few ppm?
> 
> >100-200ppm at the outset for inexpensive crystals, independent of
> >temperature drift and changes over lifetime.
> 
> >> I ran some overnight tests and found with 3 soundcards (2 usb and one mb)
> >> the rates differed by 2-4 samples/sec. About 10 times more than my 100k
> >> estimate (oops).
> 
> 4/48000 is 83ppm for my three card test case.
> This was using gettimeofday(2) and non-realtime priorities.

I have no idea if that's a valid way to do it but it's certainly in
the general range I quoted. I've bought crystals a number of times for
1394 boards we've designed and 100ppm is a pretty common price break
point.
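
As a quick check on the arithmetic quoted above, here is a minimal sketch that converts a samples-per-second difference into parts per million. The function name `rate_error_ppm` is made up for illustration; the 2-4 samples/sec and 48000 figures are the ones from the quoted test.

```python
def rate_error_ppm(sample_diff_per_sec: float, nominal_rate_hz: float) -> float:
    """Relative rate difference between two clocks, in parts per million."""
    return sample_diff_per_sec / nominal_rate_hz * 1e6


# The overnight test reported cards differing by 2-4 samples/sec at 48 kHz.
low = rate_error_ppm(2, 48000)
high = rate_error_ppm(4, 48000)
print(f"{low:.1f} to {high:.1f} ppm")
```

Both ends of that range sit under the 100ppm grade mentioned above.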

> 
> >These crystals will change frequency slightly
> >over the day. They speed up and slow down.
> ..
> 
> Yes I know, that's why you must adjust dynamically.

NON-argumentatively => But how can you know? If you are using the
system clock to get time of day info then you are depending on another
crystal source which has another 100ppm bit of uncertainty. You don't
know if it's running fast or slow.  It all seems non-deterministic to
me. (That doesn't mean something might not work though!)
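
To put a rough number on that uncertainty, here is a small sketch. It assumes the system clock crystal is also a 100ppm part, as suggested above, and that the sound card is running at exactly 48000 Hz; `measured_rate` is a hypothetical helper, not anything from ALSA or Jack.

```python
def measured_rate(true_rate_hz: float, ref_error_ppm: float) -> float:
    """Sample rate you would compute if the reference clock runs fast by
    ref_error_ppm (a negative value means it runs slow)."""
    # A fast reference clock reports more elapsed time for the same number
    # of samples, so the computed rate comes out lower than the true rate.
    return true_rate_hz / (1.0 + ref_error_ppm * 1e-6)


true_rate = 48000.0                        # assumed exact, for illustration
slowest = measured_rate(true_rate, +100)   # reference runs 100ppm fast
fastest = measured_rate(true_rate, -100)   # reference runs 100ppm slow
spread = fastest - slowest
print(f"measured rate between {slowest:.2f} and {fastest:.2f} Hz")
print(f"spread of about {spread:.1f} samples/sec")
```

Under those assumptions the uncertainty in the absolute rate is larger than the 2-4 samples/sec difference being measured.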

> 
> >The only thing I think you can do in this space is to count samples.
> 
> Yes
> 
> >If you know that all devices generate 44100 samples in whatever period
> >of time they consider a second to be then you could arbitrarily
> >designate one of them to be the master '44100 sample supplier' and
> >then *possibly* do some adjustment to your other sample streams based
> >on that.
> 
> I don't think of one being the master. I think you have a source and
> a destination for your samples and if the sample rates differ (thus they
> represent two different soundcards) then you adjust.

OK, so the fast side when receiving will delete samples and the slow
side when receiving will add samples. You might even make a decision
to not add samples until the offset gets larger than some amount so
that you're not doing this too often. You might also not add or delete
the total number of samples required every second but rather create an
offset direction that says 'You're 10 samples ahead right now so
delete 2, or you're 50 samples ahead so delete 10'. That way the changes
happen a bit more randomly and maybe aren't so noticeable. Who knows?
You'd have to build it to find out. It probably wouldn't be any worse
than taking the transmitting side and running it through a D/A and
then back through an A/D on the receiving side. However, many people do
that as it creates a sort of natural dither and even an analog warmth
to the audio. (As long as you are using top notch converters...not
sound card converters...)
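
The offset scheme described above could be sketched roughly like this. The threshold (`deadband`) and the one-fifth ratio (`divisor`) are assumptions chosen only to reproduce the '10 ahead, delete 2' and '50 ahead, delete 10' figures; nothing here has been built or listened to.

```python
def correction(offset_samples: int, deadband: int = 4, divisor: int = 5) -> int:
    """Samples to delete (positive) or add (negative) on the receiving side.

    offset_samples > 0 means the receiver is ahead, < 0 means it is behind.
    Nothing is done while the offset stays within the deadband.
    """
    if abs(offset_samples) <= deadband:
        return 0
    # Correct only a fraction of the offset, and at least one sample.
    step = max(abs(offset_samples) // divisor, 1)
    return step if offset_samples > 0 else -step


for offset in (3, 10, 50, -50):
    print(offset, correction(offset))
```

Correcting only part of the offset each time spreads the changes out instead of applying the full correction in one block.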

> 
> >What that adjustment is would be complex as you would
> >sometimes be adding and sometimes subtracting samples.
> 
> What's complex? If you have a good number for the actual sample rate
> you just do the arithmetic, no? 

That's precisely my point. I contend that you'll never have a good
number for the actual sample rate.

> Getting the actual sample rate
> seems the biggest challenge now. I still haven't looked at the timestamps
> available from alsa.

Alsa is using (or at least Jack is using) the sound card's crystal. Do
you see how incestuous this all gets? It just keeps wrapping around on
itself.

> 
> >It seems to me that this would invariably lead to clicks and pops, and
> >that should you choose to do it every second then you'll end up with
> >some repetitive click that's likely to be more noticeable.
> 
> Not every second, every N samples where N is the computed rate difference.
> So for a while (or once) it might be a insert a sample every 10,123 samples
> then for a while discard one after every 8,456 samples.

Yes, possibly...
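
For what it's worth, the 'every N samples' arithmetic would look something like the sketch below, assuming you somehow did have trustworthy numbers for both rates. `correction_interval` and the example rates are hypothetical.

```python
from typing import Optional, Tuple


def correction_interval(src_rate_hz: float,
                        dst_rate_hz: float) -> Tuple[Optional[int], str]:
    """Return (N, action): act on one sample per N source samples.

    A faster source delivers more samples than the destination consumes,
    so samples are discarded; a slower source needs samples inserted.
    """
    diff = src_rate_hz - dst_rate_hz
    if diff == 0:
        return None, "none"
    interval = round(src_rate_hz / abs(diff))
    return interval, ("discard" if diff > 0 else "insert")


# Hypothetical rates: 4 samples/sec fast, then 2 samples/sec slow.
print(correction_interval(48004, 48000))
print(correction_interval(47998, 48000))
```

As the measured rates wander, N and the direction change with them, which is the 'insert for a while, then discard for a while' behavior described in the quoted text.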

> 
> >> I think I'll try generating some pure sin tones and see if I can
> >> audibly detect sample insertion/deletion, or maybe try to find the
> >> detection threshold.
> 
> This is the test I still have to run to determine if the corrections will
> be audible. I imagine if you can't hear it with a pure tone you won't hear it
> with complex material.

I think I agree, at least for a single pass in real time. However
please consider this case:

You transfer from machine A to machine B on day X. The recorded audio
after your processing is stored. You then record the same audio from
machine A on machine B on day Y. You then play the two audio streams
against each other. You almost certainly are going to hear phasing as
the differences on different days are not identical and the position
of the audio in the stream is going to move around. I think there are
many cases where this sort of problem will kill the use of this
technology, but that's just my thought.
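
A rough sketch of why the two passes would not line up, under a deliberately simple model: on each day the receiver discards one sample per N source samples, with a different N per day. The intervals below are invented for illustration, and `source_index` is a hypothetical helper.

```python
def source_index(recorded_index: int, discard_interval: int) -> int:
    """Approximate source sample that lands at recorded_index when one
    sample has been discarded per discard_interval samples."""
    return recorded_index + recorded_index // discard_interval


RATE = 48000
N_DAY_X = 12000   # assumed: one discard per 12000 samples on day X
N_DAY_Y = 11000   # assumed: one discard per 11000 samples on day Y

for seconds in (1, 10, 60, 600):
    n = seconds * RATE
    shift = source_index(n, N_DAY_X) - source_index(n, N_DAY_Y)
    print(f"after {seconds:4d} s the passes differ by {abs(shift)} samples")
```

In this model the misalignment between the two recordings grows with running time, so material that lines up at the start drifts apart later in the take.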

> 
> I do appreciate your interest.

It's an interesting problem for sure. I hope you find an acceptable answer.

Cheers,
Mark