Category:Auditory Processing of Communication Sounds

From CNBH Acoustic Scale Wiki

Introduction to the content of the wiki

Auditory perceptions are constructed in the brain from sounds entering the ear canal, in conjunction with current context and information from memory. It is not possible to make direct measurements of perceptions, so all descriptions of perceptions involve explicit or implicit models of how perceptions are constructed. The category Auditory Processing of Communication Sounds focuses on how the auditory system might construct your initial experience of a sound, referred to as the 'auditory image'. It describes a computational model of how the construction might be accomplished -- the Auditory Image Model (AIM). The category Perception of Communication Sounds focuses on the structures that appear in the auditory image and how we perceive them. These categories are intended to work as a pair, with the reader going back and forth as their interest shifts between the perceptions themselves and how the auditory system might construct them.

Roy Patterson, Tom Walters


Introduction

In this auditory processing category, it is assumed that the sub-cortical auditory system creates a perceptual space, in which an initial auditory image of a sound is assembled by the cochlea and mid-brain using largely data-driven processes. The auditory image and the space it occupies are analogous to the visual image and space that appear when you open your eyes in the morning. If the sound arriving at the ears is a noise, the auditory image is filled with activity, but it lacks organization and the details are continually fluctuating. If the sound has a pulse-resonance form, an auditory figure appears in the auditory image with an elaborate structure that reflects the phase-locked neural firing pattern produced by the sound in the cochlea. Extended segments of sound, like syllables or musical notes, cause auditory figures to emerge, evolve, and decay in what might be referred to as auditory events. All of the processing up to the level of auditory figures and events can proceed without the need for the top-down processing associated with context or attention. For example, if you are presented with the call of an animal that you have never encountered before, the early stages of auditory processing will still produce an auditory event for the sound, even though you have no context for the sound and might be puzzled by the event. It also seems likely that the initial stages of processing operate as normal during sleep, so auditory figures and events are still produced in the auditory pathway when you are asleep. The Auditory Image Model is intended to simulate the neural processing involved in constructing our initial auditory images of sounds, without reference to the context in which they occur or our memory of similar events.
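For readers who want a concrete feel for the data-driven stages described above, the sketch below follows the general shape of such a model in Python: a gammatone-like filterbank to stand in for cochlear filtering, half-wave rectification with compression to produce a crude neural activity pattern, and a very simple form of strobed temporal integration to stabilise the repeating pattern of a pulse-resonance sound. It is a minimal illustration under simplifying assumptions, not the AIM2006 software documented elsewhere in this category; the filter parameters, compression exponent and strobe criterion are all placeholders chosen for brevity.

```python
# Minimal, illustrative sketch of data-driven auditory-image construction:
# filterbank -> neural activity pattern (NAP) -> strobed temporal integration.
# NOT the CNBH AIM implementation; all stage details are simplified assumptions.

import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth (Glasberg & Moore, 1990)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone(fc, fs, dur=0.025):
    """4th-order gammatone impulse response at centre frequency fc (Hz)."""
    t = np.arange(int(dur * fs)) / fs
    b = 1.019 * erb(fc)
    g = t**3 * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.max(np.abs(g))

def neural_activity_pattern(x, fs, centre_freqs):
    """Filter, half-wave rectify and compress each channel: a crude NAP."""
    nap = []
    for fc in centre_freqs:
        y = np.convolve(x, gammatone(fc, fs), mode="same")
        y = np.maximum(y, 0.0) ** 0.5          # rectification + compression (placeholder exponent)
        nap.append(y)
    return np.array(nap)

def stabilised_image(nap, fs, max_lag=0.035):
    """Very simple strobed temporal integration: in each channel, treat large
    local maxima as 'strobes' and average the NAP segment following each one."""
    lags = int(max_lag * fs)
    image = np.zeros((nap.shape[0], lags))
    for ch, y in enumerate(nap):
        strobes = np.where((y[1:-1] > y[:-2]) & (y[1:-1] > y[2:]) &
                           (y[1:-1] > 0.5 * y.max()))[0] + 1
        for s in strobes:
            seg = y[s:s + lags]
            image[ch, :len(seg)] += seg
        if len(strobes):
            image[ch] /= len(strobes)
    return image   # rows: channels; columns: time interval since strobe

if __name__ == "__main__":
    fs = 16000
    t = np.arange(int(0.2 * fs)) / fs
    # A pulse-resonance sound: an 8-ms pulse train exciting a damped 1-kHz resonance.
    pulses = (np.arange(len(t)) % int(0.008 * fs) == 0).astype(float)
    res_t = t[:int(0.008 * fs)]
    resonance = np.exp(-res_t / 0.002) * np.sin(2 * np.pi * 1000 * res_t)
    x = np.convolve(pulses, resonance, mode="same")
    cfs = 440 * 2 ** (np.arange(-12, 13) / 6.0)   # 25 log-spaced channels, 110-1760 Hz
    sai = stabilised_image(neural_activity_pattern(x, fs, cfs), fs)
    print(sai.shape)   # (25, 560): one stabilised pattern per channel
```

Running the script prints the shape of the stabilised image; the important point is that its second axis is a time interval measured from the strobes rather than absolute time, which is what allows the repeating structure of a pulse-resonance sound to form a stable auditory figure of the kind described above.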

The main focus of the category Auditory Processing of Communication Sounds is the form of our initial auditory images of sounds: how they are constructed from the neural activity pattern (NAP) flowing from the cochlea, and the events that arise in this space of auditory perception in response to communication sounds. It appears that the space of auditory perception is rather different from the {time, frequency} space normally used to represent speech and musical sounds, and the auditory events that appear in the auditory space are very different from the smooth energy envelopes that represent events in the spectrogram. Briefly, it appears that the space of auditory perception has three dimensions, linear time, logarithmic scale and logarithmic cycles, which will be explained below. The {log-scale, log-cycles} plane of the space is obtained through a unitary transform of the traditional {linear-time, linear-frequency} plane, and the auditory figures that appear in this new plane have the property of being scale-shift covariant (ssc), with regard to both resonance rate and pulse rate. Moreover, the three forms of information in communication sounds are largely orthogonal in this plane. The advantage of the auditory space is that the message of a communication sound appears in a form that is essentially fixed, independent of the pulse rate and the resonance rate of the sound that conveys the message. It also appears that scale-shift covariance of this form is not mathematically possible in a {linear-time, linear-frequency} representation like the spectrogram. If this is the case, then it is important to understand scale-shift covariance and the space of auditory perception in order to improve the robustness of computer-based sound processors like speech recognition machines and music classifiers.
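The sense in which a change of acoustic scale becomes a shift on a logarithmic axis can be stated compactly. The following is a generic Fourier-transform identity offered as an illustration, not the full unitary transform developed in the papers listed below. If the resonances of a sound are compressed in time by a factor $a > 0$ (i.e., the resonance rate is raised by a factor $a$), so that $s_a(t) = s(at)$, then

$$ S_a(f) \;=\; \frac{1}{a}\, S\!\left(\frac{f}{a}\right). $$

Writing the magnitude spectrum as a function of log-frequency, $\nu = \log f$,

$$ \left|S_a\right|(\nu) \;=\; \frac{1}{a}\,\left|S\right|(\nu - \log a), $$

so, apart from the overall gain $1/a$, the spectral pattern is simply translated along the log-frequency (log-scale) axis by $\log a$ with its shape preserved; on a linear frequency axis the same change of scale stretches the pattern instead of shifting it. The analogous argument for pulse rate is made on a time-interval axis, since the interval between pulses is the reciprocal of the pulse rate, so a change of pulse rate translates the pattern along a logarithmic time-interval (log-cycles) axis.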


Papers in preparation

The processing of Temporal Fine Structure (access to this page is currently restricted)

Excerpts from published papers

Research projects

The Pole-Zero Filter Cascade (access to this page is currently restricted)

AIM2006 Documentation

Published papers for the Category: Auditory Processing of Communication Sounds

Auditory filter banks: Patterson et al. (1995), Irino and Patterson (1997), Irino and Patterson (2001), Patterson et al. (2003), Unoki et al. (2006), Irino and Patterson (2006)

The construction of auditory images: Patterson et al. (1992), Patterson (1994a), Patterson (1994b), Patterson et al. (1995), Patterson and Holdsworth (1996), Bleeck et al. (2004), Patterson et al. (2006)

Invariant and scale-shift-covariant versions of the auditory image: Irino and Patterson (2002), Patterson et al. (2007), Irino et al. (2007)

Damped and ramped sounds in the auditory image: Patterson (1994a), Patterson (1994b), Irino and Patterson (1996), Patterson and Irino (1998), Akeroyd and Patterson (1995), Akeroyd and Patterson (1997), Uppenkamp et al. (2001), Lorenzi et al. (1997), Lorenzi et al. (1998), Pressnitzer et al. (2000), Neuert et al. (2001)

Pitch producing sounds in the auditory image: Patterson et al. (1996), Yost et al. (1996), Yost et al. (1998), Patterson et al. (2000), Handel and Patterson (2000), Winter et al. (2001), Wiegrebe et al. (2000), Wiegrebe and Patterson (1999), Wiegrebe et al. (1998), Stein et al. (2005), Krumbholz et al. (2000), Pressnitzer et al. (2001), Krumbholz et al. (2001), Krumbholz et al. (2003), Ives and Patterson (2008)

