[mus422] peak amplitude in the MDCT spectra

Craig Sapp craigsapp at gmail.com
Tue Mar 9 20:43:00 PST 2010


Hello Music 422 Class,

My HW3 solution is exact for the DFT (although you need to zero-pad or do
something else to get the amplitudes of peaks at fractional bin positions
correct).  Here is a demonstration of how the same solution can go horribly
wrong in the MDCT (or at least be far from optimal).  The problem is that in
the MDCT the measured peak depends on the phase of the sinusoid, whereas in
the DFT the peak amplitude is invariant to phase.

==========================================================================

from pylab import *
from numpy import *
from yourMDCTsolutions import *
from yourWindowSolutions import *

N = 1024
binoffset = +0.5       # MDCT frequencies are shifted up half a bin from FFT locations
binfreq = N // 4       # target frequency bin in the MDCT (integer, used as an index)

# cosine wave
phase1 = 0.0
signal1 = 1.0 * cos(2*pi*arange(N)*(binfreq+binoffset) / N + phase1)

# intermediate phase
phase2 = pi/4
signal2 = 1.0 * cos(2*pi*arange(N)*(binfreq+binoffset) / N + phase2)

# sine wave (multiplied by -1)
phase3 = pi/2
signal3 = 1.0 * cos(2*pi*arange(N)*(binfreq+binoffset) / N + phase3)

windowfactor = N / sum(SineWindow(ones(N)))

spec1 = MDCT(SineWindow(signal1))  * windowfactor
spec2 = MDCT(SineWindow(signal2))  * windowfactor
spec3 = MDCT(SineWindow(signal3))  * windowfactor
clf()
plot(spec1, 'r--o')
plot(spec2, 'b--o')
plot(spec3, 'g--o')
axis([252, 260, -1, 1])
legend((r"$\phi=0$", r"$\phi=\pi/4$", r"$\phi=\pi/2$"), loc="lower center")
title("cosinewave at frequency bin 256 in MDCT with various phases")
show()

# The Marina Method:
pow1 = spec1[binfreq-1]**2 + spec1[binfreq]**2 + spec1[binfreq+1]**2
### 0.22222352949106505
pow2 = spec2[binfreq-1]**2 + spec2[binfreq]**2 + spec2[binfreq+1]**2
### 0.60991814794236254
pow3 = spec3[binfreq-1]**2 + spec3[binfreq]**2 + spec3[binfreq+1]**2
### 0.99999895418519646

# Worst case decibel variation:
10 * log10(pow3 / pow1)
### 6.5320950476262096

==========================================================================
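
The script above imports course-specific modules (yourMDCTsolutions and
yourWindowSolutions) that are not included in this post.  For anyone who
wants to try it without them, here is a minimal self-contained sketch.  It
assumes a forward MDCT scaled by 2/N with phase offset n0 = (N/2 + 1)/2 and
a sine window w[n] = sin(pi*(n + 0.5)/N); both conventions are assumptions,
and sine_window, mdct and three_bin_power are hypothetical stand-ins rather
than the homework solutions.  Under those assumptions the three-bin powers
should land close to the values quoted above.

```python
import numpy as np

def sine_window(x):
    # Assumed window: w[n] = sin(pi * (n + 0.5) / N), applied to the block x.
    N = len(x)
    return x * np.sin(np.pi * (np.arange(N) + 0.5) / N)

def mdct(x):
    # Direct O(N^2) forward MDCT; assumed scaling 2/N and phase n0 = (N/2 + 1)/2.
    N = len(x)
    n0 = (N / 2 + 1) / 2
    n = np.arange(N)
    k = np.arange(N // 2)
    kernel = np.cos(2 * np.pi / N * np.outer(k + 0.5, n + n0))
    return (2.0 / N) * (kernel @ x)

N = 1024
k0 = N // 4                      # target MDCT bin (256)
n = np.arange(N)
windowfactor = N / np.sum(sine_window(np.ones(N)))

def three_bin_power(phase):
    # Sinusoid centred on MDCT bin k0 (half a bin above FFT bin k0).
    signal = np.cos(2 * np.pi * n * (k0 + 0.5) / N + phase)
    spec = mdct(sine_window(signal)) * windowfactor
    # Sum the power in bins k0-1, k0, k0+1.
    return spec, np.sum(spec[k0 - 1:k0 + 2] ** 2)

spec1, pow1 = three_bin_power(0.0)         # cosine phase
spec3, pow3 = three_bin_power(np.pi / 2)   # sine phase
variation_db = 10 * np.log10(pow3 / pow1)
print(pow1, pow3, variation_db)
```

The direct matrix form is slow compared with an FFT-based MDCT, but it
keeps the transform definition visible in one line.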

The cosine peak (red line in plot) is invisible to the MDCT at the
expected bin (256), and the only thing the MDCT can see is the outside
edges of the main lobe at bins 255 and 257.  In contrast, the green line
for the sine wave (pi/2 phase offset from a cosine) is maximally visible.
The blue line shows an intermediate phase between cosine and sine.

Even when summing the power in three bins (the Marina Method in the
script), there is still a 6.5 dB variation between the measured peak and
the actual signal amplitude in the worst case (an amplitude scaling of
about 0.5) in the MDCT, which is why the DFT is used to calculate the
masking curve :-).
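
To make the "about 0.5" figure concrete: the 6.5 dB value is a power ratio
in decibels, so the matching amplitude ratio is 10**(-dB/20).  A quick
check, using the number printed by the script (variable names here are just
for illustration):

```python
variation_db = 6.5320950476262096   # worst-case variation from the script

# A power ratio in dB converts to an amplitude ratio with a divisor of 20.
amplitude_ratio = 10 ** (-variation_db / 20)
print(amplitude_ratio)              # a little under 0.5
```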

The reason for this variation can be seen by the expanded form of the DFT:
   DFT(x) = DCT(x) + i DST(x)
Whatever the DCT cannot see, the DST (Discrete Sine Transform) can,
and vice versa.  (Sine waves are invisible to the DCT, cosine waves are
invisible to the DST, and both see all other phases with complementary
amplitudes.)  The MDCT is a twisted version of the DCT.
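
A small sketch of that split, using NumPy's FFT on a sinusoid at an exact
FFT bin (no half-bin offset, no window) so the numbers stay clean.  Here
the real part of the DFT bin plays the role of the cosine projection and
the imaginary part the sine projection; the 2/N normalization and the
variable names are choices made for this illustration only.

```python
import numpy as np

N = 1024
k0 = 256                          # exact FFT bin
n = np.arange(N)

results = []
for phase in (0.0, np.pi / 4, np.pi / 2):
    x = np.cos(2 * np.pi * k0 * n / N + phase)
    X = np.fft.fft(x)[k0] * 2 / N     # normalize so a unit sinusoid gives |X| = 1
    # Real part follows cos(phase), imaginary part follows sin(phase),
    # and the magnitude does not depend on the phase.
    results.append((phase, X.real, X.imag, abs(X)))
    print(phase, X.real, X.imag, abs(X))
```

The real part alone goes from full amplitude to zero as the phase moves
from 0 to pi/2, while the magnitude stays put, which is the same kind of
phase dependence a real-valued transform such as the MDCT shows.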

In the worst case (red line), my method generates a double peak that
underestimates the original peak amplitude by 9.6 dB and is off by
+/- one frequency bin from the true peak:
   10 * log10(spec1[binfreq+1]**2)

-=+Craig
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mdctphases.png
Type: image/png
Size: 43801 bytes
Desc: not available
Url : http://cm-mail.stanford.edu/pipermail/422/attachments/20100309/b46ba858/attachment-0001.png 

