CompressiveDistortion.5
From CNBH Acoustic Scale Wiki
Roy Patterson
There is an authoritative and highly readable description of waveform distortion from the signal processing perspective in Hartmann (1997). There is also a useful Wikipedia entry under Distortion#Electronic_signals. The distortion products of current interest are those that are traditionally referred to as:
- Quadratic difference tones of the form f2-f1
- Cubic difference tones of the form 2f1-f2
- Quadratic harmonics of the form 2f1
- Two-tone suppression which refers to the level of 2f1-f2 in the cochlea
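As a rough illustration of where the first three products fall, the sketch below passes two equal-amplitude tones through a memoryless polynomial nonlinearity and measures the components at f2-f1, 2f1-f2 and 2f1. The polynomial, its coefficients (`a2`, `a3`), the tone frequencies and the helper names are all assumptions made for the example; it is not a cochlear model.

```python
import cmath
import math


def tone_amplitude(signal, freq_hz, fs):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT).

    Exact only when the signal holds a whole number of cycles of freq_hz.
    """
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * k / fs)
              for k, s in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)


def two_tone_products(f1, f2, a2=0.1, a3=0.1, fs=8000, n=8000):
    """Distortion products of y = x + a2*x**2 + a3*x**3 for a two-tone input.

    a2 and a3 are illustrative coefficients, not values taken from any model.
    """
    x = [math.cos(2 * math.pi * f1 * k / fs) + math.cos(2 * math.pi * f2 * k / fs)
         for k in range(n)]
    y = [v + a2 * v ** 2 + a3 * v ** 3 for v in x]
    return {
        "f2-f1": tone_amplitude(y, f2 - f1, fs),       # quadratic difference tone
        "2f1-f2": tone_amplitude(y, 2 * f1 - f2, fs),  # cubic difference tone
        "2f1": tone_amplitude(y, 2 * f1, fs),          # quadratic harmonic
    }


products = two_tone_products(1000, 1200)
for name, amp in products.items():
    print(f"{name}: {amp:.4f}")
```

With these assumed coefficients and unit-amplitude tones, the three amplitudes should come out near 0.1, 0.075 and 0.05, which is what expanding the squared and cubed terms predicts.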
Quadratic difference tones, QDTs
These difference tones are not currently in the news, perhaps because the magnitude of the quadratic distortion component produced by a single pair of high harmonics is fairly small at the difference frequency. But the pulse-resonance sounds of speech and music contain many adjacent harmonics, and each pair of adjacent harmonics adds a component to the distortion product because they are largely in phase. I have attached PDFs of two fairly recent studies (Pressnitzer and Patterson, 2001; Wiegrebe and Patterson, 1999) which show that quadratic distortion components make a significant contribution to the distribution of activity in the F0 region of the cochlea. Like cubic distortion components, they are not disruptive in perception. Indeed, they give the low frequencies in the sound a boost.
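The accumulation across adjacent harmonic pairs can be sketched with a toy calculation: square a cosine-phase complex of adjacent high harmonics and measure the component at F0. The squaring nonlinearity, the harmonic numbers, F0 = 100 Hz and the function names are assumptions chosen for the example, and cosine phase stands in for "largely in phase".

```python
import cmath
import math


def tone_amplitude(signal, freq_hz, fs):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * k / fs)
              for k, s in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)


def qdt_at_f0(num_harmonics, lowest=10, f0=100, a2=1.0, fs=8000, n=800):
    """Amplitude at f0 after squaring a cosine-phase complex of adjacent harmonics.

    The input holds harmonics lowest .. lowest + num_harmonics - 1 of f0,
    each with unit amplitude; a2 is an illustrative quadratic coefficient.
    """
    harmonics = range(lowest, lowest + num_harmonics)
    x = [sum(math.cos(2 * math.pi * h * f0 * k / fs) for h in harmonics)
         for k in range(n)]
    y = [a2 * v * v for v in x]
    return tone_amplitude(y, f0, fs)


for count in (2, 4, 6):
    print(count, "harmonics ->", round(qdt_at_f0(count), 4))
```

In this idealised case each adjacent pair contributes one unit of amplitude at F0, so the distortion product grows in proportion to the number of adjacent pairs (one fewer than the number of harmonics).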
Cubic distortion tones, CDTs
These are normally studied with carefully contrived pairs of sinusoids. They are useful for helping us understand cochlear mechanics. They are of little practical importance in machine hearing because they are at best 15 dB down and they typically arise at frequencies where there is already energy. If the sound is speech, music or environmental noise, all they do is make small adjustments to the amplitude and phase of existing components of the sound. I imagine your model produces cubic distortion components and that they propagate to the correct place. This would be good to demonstrate with a pair of sinusoids because it is an inherent property of the cascade filterbank that does not arise in parallel filterbanks. It is an emergent property of the system and costs you nothing in terms of parameters or computational load.
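The point that cubic distortion tones land where there is already energy can be checked with simple arithmetic. The sketch below assumes a complete harmonic series (harmonics 1 to 10 of a hypothetical 100-Hz fundamental) and lists the lower-side products 2f1-f2 for every pair with f1 < f2; the function name and the choice of series are illustrative only.

```python
def lower_cdt_frequencies(components):
    """All positive 2*f1 - f2 values for pairs of components with f1 < f2."""
    return sorted({2 * lo - hi
                   for lo in components for hi in components
                   if lo < hi and 2 * lo - hi > 0})


f0 = 100  # assumed fundamental in Hz
harmonics = {h * f0 for h in range(1, 11)}
cdts = lower_cdt_frequencies(harmonics)
coincident = [f for f in cdts if f in harmonics]
print(f"{len(coincident)} of {len(cdts)} cubic difference tones fall on existing harmonics")
```

For a series that starts at the fundamental, every lower-side product coincides with an existing harmonic. A complex with missing low harmonics would behave differently, which is why carefully contrived stimuli are needed to observe these products in isolation.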
Quadratic harmonics
These have the form 2fi and are notable by their absence. They become audible at levels above 95 dB SPL, but they are not heard at normal listening levels. They are even-order distortion components just like difference tones, and they come from the same terms in the Taylor series as those that generate difference tones. They arise an octave above the component that produces them, but in many cases that would not be further from the site of generation than difference tones. They would not accumulate the way difference tones do, but in parallel filterbank systems there is no reason for them to be much smaller in magnitude than quadratic difference tones. They are not observed because propagation is highly asymmetric in the cochlea. Since this is an inherent property of CAR-FAC systems, I suspect that your model can explain the asymmetry between the levels of quadratic difference tones and quadratic harmonics as well.
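The shared origin of the two even-order products can be seen by expanding the quadratic Taylor term for a two-tone input. The symbols here (coefficient a_2, amplitudes A_1 and A_2, angular frequencies ω_1 < ω_2) are notation introduced for this sketch, not taken from any of the cited models.

```latex
\begin{align*}
a_2\,(A_1\cos\omega_1 t + A_2\cos\omega_2 t)^2
 &= \tfrac{a_2}{2}\,(A_1^2 + A_2^2) \\
 &\quad + \tfrac{a_2}{2} A_1^2 \cos 2\omega_1 t
        + \tfrac{a_2}{2} A_2^2 \cos 2\omega_2 t \\
 &\quad + a_2 A_1 A_2 \cos(\omega_1 + \omega_2)t
        + a_2 A_1 A_2 \cos(\omega_2 - \omega_1)t
\end{align*}
```

For equal amplitudes the harmonic at 2ω_1 has half the amplitude of the difference tone at ω_2 - ω_1, about 6 dB lower at the point of generation, so the nonlinearity alone does not account for the absence of quadratic harmonics.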
Two-tone suppression, 2TS
Two-tone suppression involves distortion of level rather than frequency, and it is a hot topic in cochlear mechanics. It is two-tone suppression that necessitated the change in the level control mechanism of the compressive gammachirp (Irino and Patterson, 2006) and the very short time constant (0.5 ms). This phenomenon suggests that the symmetry of the AGC network in the original version of the PZFC is probably not correct, and that the fastest time constant is probably too slow. Walters (2011) experimented with asymmetric versions of the AGC network and faster time constants, but in the context of pitch strength rather than 2TS. The kind of demonstration that is required is illustrated in Irino and Patterson (2006).
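As a minimal sketch of what "distortion of level" means, the example below measures the output level of a probe tone with and without a second, stronger tone, using a static compressive nonlinearity y = x - c·x³. This is an assumption made purely for illustration: a memoryless nonlinearity is symmetric in frequency and has no time constants, so it says nothing about the AGC asymmetry or speed discussed above, and it is not the dynamic compressive gammachirp or the PZFC. The coefficient `c`, the tone levels and the function names are hypothetical.

```python
import cmath
import math


def tone_amplitude(signal, freq_hz, fs):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * k / fs)
              for k, s in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)


def probe_level(a_probe, a_suppressor, c=0.1,
                f_probe=1000, f_supp=1200, fs=8000, n=8000):
    """Output amplitude at f_probe for a static compressive nonlinearity.

    y = x - c * x**3 is an illustrative stand-in for compression.
    """
    x = [a_probe * math.cos(2 * math.pi * f_probe * k / fs)
         + a_suppressor * math.cos(2 * math.pi * f_supp * k / fs)
         for k in range(n)]
    y = [v - c * v ** 3 for v in x]
    return tone_amplitude(y, f_probe, fs)


alone = probe_level(0.5, 0.0)   # probe on its own
paired = probe_level(0.5, 1.0)  # probe plus a stronger second tone
print(f"probe alone:  {alone:.4f}")
print(f"probe paired: {paired:.4f}")
print(f"suppression:  {20 * math.log10(paired / alone):.2f} dB")
```

The probe component is smaller when the second tone is present, even though the probe input is unchanged. A realistic demonstration would also need the dependence on suppressor frequency and the fast dynamics shown in Irino and Patterson (2006).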
References
- Hartmann, W.M. (1997). Signals, Sound, and Sensation. (AIP Press).
- Irino, T. and Patterson, R.D. (2006). “A Dynamic Compressive Gammachirp Auditory Filterbank.” IEEE Transactions on Audio, Speech, and Language Processing, 14, p.2222-2232.
- Pressnitzer, D. and Patterson, R.D. (2001). “Distortion products and the pitch of harmonic complex tones.” In Physiological and Psychophysical Bases of Auditory Function, Breebaart, D.J., Houtsma, A.J.M., Kohlrausch, A., Prijs, V.F. and Schoonhoven, R. editors, p.97-103 (Shaker).
- Wiegrebe, L. and Patterson, R.D. (1999). “Quantifying the distortion products generated by amplitude-modulated noise.” J. Acoust. Soc. Am., 106, p.2709-2718.