[CM] removing harmonic content from sound (solutions for real-time use)

Sat Aug 19 09:01:44 PDT 2023

Hi All, sorry for not chiming in before, I'm in NY sending my son off to
college and have been very busy. Here are some ideas regarding Anders'
request:

1) If you are working with harmonic instruments/sources, you could do this
by tracking the F0 and using a bank of notch filters, similar to the way
hum removal algorithms work. I got good results doing this in real time in
SC with relatively stable harmonic sound sources.

2) You could also do the same as in #1 using the FFT. You would still need
to detect the F0, then remove (or attenuate) the bins of the magnitude
spectrum corresponding to the harmonic spectrum, and finally use the IFFT
to resynthesize the sound. This should be done on a frame by frame basis
using OLA.

3) If your sound doesn't have a harmonic spectrum (or you don't know) you
could use a technique similar to #2 except that instead of F0 tracking you
do peak detection on the magnitude spectrum, then once the main peaks are
detected (you could use a threshold in dB to define them) you remove or
attenuate them from the magnitude spectrum and then use the IFFT
to resynthesize the sound. This is done on a frame by frame basis, using
OLA; no tracking is required. You can look at the peak-detection algorithm
used in ATS (or SMS for that matter). I had some very good results using
this method in real time with strings and wind instruments.

4) Same as #3 but doing peak tracking and resynthesis. In this case you
could try to track the peaks across a certain number of frames, then remove
them from the FFT magnitude spectrum before resynthesis. This could work
fine if you track stable sinusoids over a few frames. You could try the ATS
tracking algorithm for this method.

5) You could also use a combined frequency- and time-domain algorithm with
#3 and #4, using phase information. In this case, once the peaks are
detected, you would synthesize them using the phase information and then
remove them by subtracting the result from the time-domain signal (frame).
For #4 this would require phase tracking using high-order interpolation,
but it is doable in real time. This residual generation method is the one
used by ATS (and SMS) offline.
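
A rough sketch of idea #1, in plain Python rather than SC, assuming the F0
is already known for the block being processed (the F0 tracker itself is
not shown). The function names, the Q value and the number of harmonics are
illustrative assumptions, not taken from any existing implementation. The
notch is a standard biquad in the commonly used "audio EQ cookbook" form:

```python
import math

def notch_coeffs(freq_hz, sample_rate, q=30.0):
    """Biquad notch coefficients, normalised so that a0 == 1."""
    w0 = 2.0 * math.pi * freq_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def biquad(signal, b, a):
    """Direct-form I biquad applied to a list of samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def remove_harmonics(signal, f0_hz, sample_rate, n_harmonics=10, q=30.0):
    """Cascade one notch per harmonic of f0 that lies below Nyquist."""
    for k in range(1, n_harmonics + 1):
        freq = k * f0_hz
        if freq >= sample_rate / 2.0:
            break
        b, a = notch_coeffs(freq, sample_rate, q)
        signal = biquad(signal, b, a)
    return signal
```

In a real-time setting the coefficients would have to be recomputed
whenever the tracked F0 moves and the filter state kept between blocks;
this sketch resets the state on every call. A higher Q gives narrower
notches but a longer settling time.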
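
For idea #2, the bin-selection step can be sketched as below. This only
builds the per-bin gains for one frame; the windowing, FFT, IFFT and OLA
around it are omitted. The function name, the width of the region removed
around each harmonic and the residual gain are assumptions for
illustration:

```python
def harmonic_bin_gains(f0_hz, sample_rate, n_fft, half_width_bins=1, gain=0.0):
    """Gains for bins 0..n_fft//2: `gain` around each harmonic of f0, else 1.0."""
    n_bins = n_fft // 2 + 1
    gains = [1.0] * n_bins
    bin_hz = sample_rate / n_fft
    k = 1
    while k * f0_hz < sample_rate / 2.0:
        centre = round(k * f0_hz / bin_hz)  # nearest bin to the k-th harmonic
        for b in range(centre - half_width_bins, centre + half_width_bins + 1):
            if 0 <= b < n_bins:
                gains[b] = gain
        k += 1
    return gains
```

Multiplying each complex bin of the frame by the corresponding (real) gain
scales its magnitude and leaves the phase untouched, so the frame can go
straight to the IFFT. The width needed in practice depends on the analysis
window's main lobe.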
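
For idea #3, a minimal per-frame peak picker with a dB threshold might look
like the following. This is not the ATS or SMS peak-detection algorithm,
only a simplified stand-in: it assumes the threshold is measured relative
to the largest bin of the frame, and the names and the width of the removed
region are made up for the example:

```python
def find_peaks_db(magnitudes, threshold_db=-40.0):
    """Indices of local maxima within threshold_db of the frame's largest bin."""
    top = max(magnitudes)
    if top <= 0.0:
        return []
    floor = top * 10.0 ** (threshold_db / 20.0)
    peaks = []
    for i in range(1, len(magnitudes) - 1):
        m = magnitudes[i]
        if m >= floor and m > magnitudes[i - 1] and m >= magnitudes[i + 1]:
            peaks.append(i)
    return peaks

def attenuate_peaks(magnitudes, peaks, half_width_bins=2, gain=0.0):
    """Copy of the magnitude spectrum with each peak region scaled by `gain`."""
    out = list(magnitudes)
    for p in peaks:
        lo = max(0, p - half_width_bins)
        hi = min(len(out), p + half_width_bins + 1)
        for b in range(lo, hi):
            out[b] *= gain
    return out
```

The attenuated magnitudes would then be recombined with the original
phases before the IFFT and OLA.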
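
For idea #4, the tracking step could be approximated with a very simple
greedy matcher that keeps only the peaks of the newest frame that can be
followed back through the last few frames. This is a simplified stand-in,
not the ATS tracking algorithm; `stable_peaks`, `min_frames` and
`max_jump_bins` are hypothetical names and parameters:

```python
def stable_peaks(peak_history, min_frames=3, max_jump_bins=1):
    """Peaks of the newest frame traceable through the last `min_frames` frames.

    peak_history: list of per-frame lists of peak bin indices, oldest first.
    """
    if len(peak_history) < min_frames:
        return []
    recent = peak_history[-min_frames:]
    stable = []
    for peak in recent[-1]:
        position = peak
        for frame in reversed(recent[:-1]):
            candidates = [p for p in frame if abs(p - position) <= max_jump_bins]
            if not candidates:
                break  # the track is broken, so this peak is not stable
            position = min(candidates, key=lambda p: abs(p - position))
        else:
            stable.append(peak)
    return stable
```

Note that a peak only counts as stable once it has been seen in
`min_frames` consecutive frames, so the start of every partial passes
through unremoved for that long.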
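
For idea #5 applied to #3 (no tracking), the time-domain subtraction can be
sketched as follows. It assumes each detected peak has already been turned
into an (amplitude, frequency, phase) triple, with the phase referred to
the first sample of the frame, and that all three stay constant within the
frame. The interpolated phase tracking needed for #4 is not shown, and the
function name is hypothetical:

```python
import math

def subtract_partials(frame, partials, sample_rate):
    """Residual after subtracting synthesized sinusoids from a frame.

    partials: iterable of (amplitude, freq_hz, phase_rad) triples, with
    phase_rad being the phase of a cosine at sample 0 of the frame.
    """
    residual = list(frame)
    for amp, freq_hz, phase in partials:
        step = 2.0 * math.pi * freq_hz / sample_rate
        for n in range(len(residual)):
            residual[n] -= amp * math.cos(step * n + phase)
    return residual
```

How well this cancels depends entirely on the accuracy of the amplitude,
frequency and phase estimates; a small frequency error leaves more and more
of the partial behind towards the end of the frame.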

OK, I hope this is helpful.
Cheers,
JUAN

On Thu, Aug 17, 2023 at 4:58 AM Orm Finnendahl <
orm.finnendahl at selma.hfmdk-frankfurt.de> wrote:

> Hi Anders,
>
> On Wednesday, 16 August 2023 at 22:50:30 (+0200),
> Anders Vinjar wrote:
> >
> > My problem is the real-time requirements (no off-line pre-analysis), and
> > not-very-large latency.  Do you think ATS could provide solutions for
> > those?
>
>  I don't think so.
>
> > The Median Filtering approach of FitzGerald is perfect!  But needs
> > several frames to track partials to distinguish from non-harmonic
> > content (as ATS etc.).  Probably as close as I'll come...
>
> I completely agree. Although I heard very convincing AI based
> extractions of drum parts from instrumental tracks and imagine this
> could work very well for your application case, I would expect you
> need a fair bit of training data and expertise to make it work.
>
> I'd be quite interested to hear about your experiences and results
> (just post offline).
>
> --
> Orm
> _______________________________________________
> Cmdist mailing list
> Cmdist at ccrma.stanford.edu
> https://cm-mail.stanford.edu/mailman/listinfo/cmdist
>
>