Family and Register in Musical Tones
From CNBH Acoustic Scale Wiki
Roy Patterson, Etienne Gaudrain, Tom Walters
This Chapter is about the sounds made by musical instruments and how we perceive those sounds. The Chapter is intended to explain the basics of musical note production, such as, why a particular instrument plays a specific range of notes; why instruments come in families; and why we hear distinctive differences between members of a given instrument family - even when they are playing the same note. The chapter focuses on the relationship between the physical variables that determine how musical instruments produce their notes (like the length, mass and tension of a string) and the acoustic variables of sounds (like intensity and frequency). The discussion reveals that there are three acoustic properties of musical sounds, as they occur in air, that are particularly useful in (a) summarizing the action of the physical properties of instruments and their limitations, on the one hand, and (b) explaining the dimensions of musical note perception, on the other hand. Accordingly, the first section of the Chapter sets out the aspects of note perception to be explained. The second section describes the three important acoustic properties of musical notes as they pertain to note perception. The third section explains the relationship between the physical variables of note production (length, mass, tension, etc) and the acoustic variables observed in the sounds. The final section presents a set of novel melodies to illustrate the interaction of the two acoustic scale variables in determining the qualities of the instrument or voice producing the musical tones.
Instrument families, registers within families, and melodies
The discussion of note perception focuses on the sounds produced by the sustained-tone instruments of the orchestra and chorus, that is, the families of instruments referred to collectively as brass, strings, woodwinds, and voice. Table I shows four of the instruments in each of the families, ordered in terms of their register, or voice. With just a little training, most people can identify these sixteen instruments from a simple melody (van Dinther and Patterson 2006). From this perspective, the purpose of the chapter is to explain how auditory perception enables us to make these distinctions.
When members of different instrument families play the same note - say a trombone, a cello, a bassoon, and a baritone all produce their version of C3 (the C below middle C on the keyboard) - it is the pitch of the notes that is the same and it is the timbre of the notes that is different. This is the traditional distinction between the perceptual variables pitch and timbre. The instruments of a given family have similar physical shapes, they are made of similar materials, and they are excited in similar ways, so it is not surprising that the instruments of a family produce a category of notes with a distinctive, unifying, sound quality, or timbre. The categories of timbre associated with instrument families are labelled with words that describe some physical aspect of the source. So, the trumpet is a brass instrument, the clarinet is a wood-wind instrument, and the violin is a string instrument. The family aspect of timbre is largely determined by the shape of the envelope of the magnitude spectrum and the pitch is largely determined by the repetition rate of the note. So, these aspects of note perception would appear straightforward. Within a family of instruments, the different members are distinguished physically by their size and perceptually by their register: soprano, alto, tenor, baritone or bass. Register is a very important aspect of note perception, but it is not as straightforward as pitch and instrument family because there are two separate aspects of instrument size involved in determining the perceived register of an instrument.
The first aspect of instrument size associated with the perception of register is the size of the source that activates, or excites, the instrument. In the string family, for example, it is the bow pushing on an individual string that excites the instrument, and it is primarily the length, mass, and tension of the string that determine the note that is produced. Together, these physical variables determine the repetition rate of the vibration that the instrument produces. Repetition rate is one of three acoustic properties that are important in musical note perception. It is a major determinant of the perceived register of an instrument, as well as the prime determinant of the pitch of a note. Register and pitch are two different aspects of note perception, and this is one of the distinctions that the chapter is intended to explain. At this point, it is sufficient to note that the musician varies the length of the string by depressing it against the neck of the instrument to vary the repetition rate and produce a melody. The three physical variables, length, mass and tension, are used by instrument makers to set the four open-string notes on each instrument, and the repetition rates of the open-string notes play a large role in establishing the register of the instrument.
The second aspect of instrument size associated with the perception of register is the size of the resonators in the body of the instrument; together, the resonators filter the pulse stream produced by the source. In the string family, the filter is primarily the bridge and face plate in conjunction with the volume of air in the body of the instrument and the f holes in the face plate. Acoustically, the resonances of the filter give the frequency spectrum of the note a distinctive shape which is closely associated with the sound quality, or timbre, of the instrument family. Spectral shape is the second of the three acoustic properties that are important in musical note perception. The spectrum also contributes to the perception of register. Within a family, if the spectrum is plotted on a logarithmic frequency axis, the distribution of activity shifts towards the origin without changing its basic shape, as the size of the instrument increases. Thus, the position of the spectral pattern contributes to the perception of instrument register, and the position of the spectral pattern is the last of the three acoustic properties that are important in musical note perception.
In general terms, then, the purpose of this chapter is to describe how the physical variables of note generation are related to the acoustic variables of notes as sounds, and how these acoustic variables are related to the perception of melodies, the perception of instrument family, and the perception of instrument register.
Pulse-resonance sounds and acoustic scale
The acoustic variables that are important to the discussion of music perception are conveniently illustrated with the sustained notes produced by singers, that is, sustained vowels. This also provides an opportunity to introduce the voice as a family of musical instruments. The waveform and spectrum of a synthetic /a/ vowel, like that spoken by a child, are presented in the upper and lower panels of Fig. 1, respectively. The waveform shows that a vowel is a stream of glottal pulses, each of which is accompanied by a decaying resonance that reflects the filtering of the vocal tract above the larynx. The set of vertical lines in the lower panel of Fig. 1 shows the long-term magnitude spectrum of the sound, and the dashed line connecting the tops of the vertical lines shows the spectral envelope of the vowel. In this example, the glottal pulse rate (GPR) is 200 pulses per second (pps), so the time between glottal pulses in the upper panel of Fig. 1 is 5 ms, and the spacing between the harmonics that form the fine structure of the spectrum in the lower panel is 200 Hz. The resonances in the spectral envelope are the formants of the vowel; the shape of the envelope in the spectral domain corresponds to the shape of the damped resonance in the time domain.
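The link between pulse rate, pulse period and harmonic spacing in this example can be sketched in a few lines. The function names below are illustrative and are not part of the original chapter; the only value taken from the text is the glottal pulse rate of 200 pps.

```python
def pulse_period_ms(pulse_rate_pps):
    """Time between successive pulses, in milliseconds."""
    return 1000.0 / pulse_rate_pps


def harmonic_frequencies(pulse_rate_pps, count):
    """Frequencies (Hz) of the first `count` harmonics of the pulse rate.

    The harmonics are integer multiples of the pulse rate, so their
    spacing in the spectrum equals the pulse rate itself.
    """
    return [pulse_rate_pps * n for n in range(1, count + 1)]


gpr = 200.0                                # glottal pulse rate from the text, in pps
period = pulse_period_ms(gpr)              # time between glottal pulses
harmonics = harmonic_frequencies(gpr, 4)   # fine structure of the spectrum
spacing = harmonics[1] - harmonics[0]      # equals the pulse rate
```

For a GPR of 200 pps this gives a 5 ms period and harmonics spaced 200 Hz apart, matching the two panels of Fig. 1.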
When children begin to speak they are about 0.85 meters tall, and as they mature their height increases by about a factor of two. The GPR of the voice decreases by about an octave as the child grows up and the vocal cords become longer and more massive. The decrease is greater than an octave for males and less than an octave for females; on average it is more than an octave but much less than two octaves. A person's vocal tract length increases in proportion to their height (Fitch and Giedd 1999; Turner et al. 2009, their Fig. 4), and so the formant frequencies of children's vowels decrease by about an octave as they mature (Lee et al. 1999; Turner et al. 2009).
The effect of growth on the spectrum of a vowel is quite simple to characterize, provided the spectrum is plotted on a logarithmic frequency scale. In this case, the set of harmonics that define the fine structure of the spectrum simply moves, as a unit, towards the origin as the child matures into an adult, by somewhat more than one octave. In speech, the pattern of formants that defines a given vowel type remains largely unchanged as people grow up (Peterson and Barney 1952; Lee et al. 1999; Turner et al. 2009). So the shape of the spectral envelope does not change as a child grows up, the envelope just shifts towards the origin by about an octave, without changing shape.
These shifts in the fine structure and the envelope of the vowel do not change our perception of the vowel type; in the current example, the vowel remains an /a/, and does not change to an /e/, an /o/ or an /u/, as the size of the singer changes. The shifts do, however, have systematic effects on our perception of who is singing and what melody they are singing, so they are important variables in the perception of music. In acoustic terms, 'the position of the fine-structure of the spectrum on a logarithmic frequency scale' is the acoustic scale of the fine structure, and it is closely associated with the physical variables that control the source of excitation - the vocal folds - and the rate at which they vibrate. The position of the fine structure is a property of the sound as it occurs in the air (Cohen, 1993) - a property that appears in the time domain as the Glottal Pulse Rate (GPR) of the speaker. For brevity, it will be referred to as 'the scale (S) of the source (s)' and designated Ss. Similarly, in acoustic terms, 'the position of the envelope on a logarithmic frequency scale' is the acoustic scale of the set of resonances in the vocal tract that produce the formants in the spectrum; this is another property of the sound, and in this case, the most relevant physical variable is the speaker's vocal tract length (VTL). For brevity, this property will be referred to as 'the scale (S) of the filter (f)' and designated Sf. Turner et al. (2009) have recently reanalysed several large databases of spoken vowels and shown that almost all of the variability in formant frequency data that is not vowel-type information is Sf information.
When vowel type changes, e.g., from /a/ to /i/, the shape of the envelope changes, so the different vowel types are, in some sense, like different instrument families; however, the set of vowel types is like a cluster of families that is, nevertheless, distinct from the other musical families. In auditory terms, since different vowels can be sung by the same person on the same note, vowel type is like family timbre, and since singers with different vocal registers, e.g., alto and baritone, can sing the same vowel, vocal register is essentially the same as instrument register.
Pulse-resonance sounds and the source-filter model of musical sounds
This section of the chapter reviews the "source-filter" model of musical sound generation as it pertains to several families of musical instruments. The sustained-tone instruments of the orchestra produce pulse-resonance sounds whose basic properties are similar to the vowel illustrated in Fig. 1; that is, a source within the instrument produces a temporally regular stream of pulses and, then, resonators in the body of the instrument filter the pulses and introduce the spectral resonances that give the note, and thus, the instrument, its distinctive timbre. In this section, the source-filter model is used to illustrate the relationship between the physical properties of the instrument, on the one hand, and the three main acoustic properties of these sounds, on the other hand.
The source of excitation and the acoustic variable, Ss
In general terms, the source in these instruments is a highly nonlinear resonant system that produces a temporally regular stream of acoustic pulses similar to a click train. The systems themselves are quite diverse. For example, in the voice, it is the vocal folds; in brass instruments, it is the lips coupled to the mouthpiece; in the woodwinds, it is the reed coupled to the lips; and in string instruments, it is the bow coupled to one of the strings. Despite the diversity of mechanisms, all of the sources produce streams of very precise pulses and this virtually ensures that the waves produced by sustained-tone instruments are pulse-resonance sounds, like vowels. [In Fourier terms, the overtones of the pulse rate are locked to the pulse times both in frequency and phase up to fairly high harmonic numbers.]
The acoustic scale of the source of excitation, Ss, is the repetition rate of the wave as it occurs in the air between the instrument and the listener. Ss is determined by physical properties of the instrument, like length and mass, which are not themselves acoustic variables. Ss largely determines the pitch we hear, but Ss is not an auditory variable. It is an intervening, acoustic variable that describes a property of the sound in air, and it should be distinguished from pitch which is the auditory variable of perception. The relationship between Ss and the physical variables of the instrument will be illustrated by comparing how Ss is determined in the vocal tract and in stringed instruments.
The source of excitation in the human voice
The vocal folds produce glottal pulses in bursts and, although the vocal folds are rather complicated structures, the effect of the physical variables on the rate of pulses can be calculated using the expression for a tense string. The glottal pulse rate, GPR, is largely determined by the length, L, mass, M, and tension, T, of the vocal folds, and the form of the relationship is
$$\mathrm{GPR} \propto \sqrt{\frac{T}{L\,M}} \qquad (1)$$
Two of these physical variables are determined by the size of the person - the length and mass of the vocal folds. Both of these variables increase as a child grows up, and both of these terms are in the denominator on the right-hand side of the equation, so as the child increases in height the pitch of the voice decreases. The average GPR for small children is about 260 Hz, both for males and females. For females GPR just decreases with height throughout life dropping to, on average, about 160 Hz in women. For males, GPR decreases with height until puberty at which point the vocal folds suddenly increase in mass and the GPR drops to, on average, about 120 Hz in men. So it is the length and mass of the vocal folds that primarily determine whether a person is a soprano, alto, tenor, baritone or bass.
To produce a melody within their specific register, a singer varies the tension of the vocal folds. So learning to sing in tune is largely a matter of learning to control the tension of the vocal folds, holding the tension fixed during sustained notes and changing it abruptly between notes. Tension is in the numerator of the expression, and so as you increase the tension, you increase the GPR. There is considerable overlap in the note ranges of the vocal registers; in fact, the highest note of a bass is typically a note or two above the lowest note of a soprano. Finally, note that the effect of all of these variables is restrained by the fact that the GPR depends on the square root of the ratio of tension to the product of length and mass. So, for example, a singer has to change the tension of the vocal folds by a factor of sixteen to produce a two-octave range of notes, because the square root of sixteen is four and a factor of four in pulse rate is two octaves.
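The square-root dependence can be checked numerically. The sketch below assumes the tense-string proportionality described in the text, in which the pulse rate varies as the square root of tension divided by the product of length and mass. The function names and the unit values for length and mass are illustrative only; absolute rates are not meaningful here, only ratios.

```python
import math


def relative_pulse_rate(tension, length=1.0, mass=1.0):
    """Pulse rate up to a constant factor, assuming rate ~ sqrt(T / (L * M))."""
    return math.sqrt(tension / (length * mass))


def octaves_between(rate_high, rate_low):
    """Musical interval between two pulse rates, in octaves."""
    return math.log2(rate_high / rate_low)


def tension_ratio_for(octaves):
    """Tension ratio needed to raise the pulse rate by `octaves` octaves.

    Each octave doubles the rate, and the rate goes as sqrt(T),
    so the tension must change by 2 ** (2 * octaves).
    """
    return 2.0 ** (2.0 * octaves)


base = relative_pulse_rate(tension=1.0)
raised = relative_pulse_rate(tension=16.0)   # sixteen-fold increase in tension
interval = octaves_between(raised, base)     # two octaves
```

Under this assumption a one-octave range needs a four-fold change in tension, and a two-octave range needs the sixteen-fold change mentioned above.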
In summary, for a specific individual, their size, in the form of the length and mass of their vocal folds, determines their long-term average GPR, and the Ss component of the register of their voice. Varying the GPR to produce a melody involves varying the tension of the vocal folds, and thus, it also changes Ss. So the long-term average GPR, calculated over a sequence of musical phrases determines the register of the singer's voice, and short term changes in Ss are used to produce melodies.
The source of excitation in the string family
The excitation mechanism in stringed instruments is the string pushed by the bow. As the musician draws the bow across a string, the string is pushed or pulled away from its resting position until the tension becomes too great and then it snaps back to its resting position, vibrating in the process. The string and bow actually produce a coupled system that includes the musician, and over the first 50 ms or so of the note, the resonant properties of the string force the system to produce a standing wave. The shape of the cycle is unusual; it is the offset of each cycle that is abrupt rather than the onset, but the result is, nevertheless, a pulse-resonance sound. Although the bow-string system is rather complicated physically (McIntyre, 1983), the relationship between the rate of pulses and the main physical variables is the same as for the vocal folds, namely,
$$\text{pulse rate} \propto \sqrt{\frac{T}{L\,M}} \qquad (2)$$
In this case, however, T, M, and L are the tension, mass and length of the string, rather than the vocal folds. The two physical variables associated with the size of the source (the length and mass of the string) are the most important variables in this family of instruments and they each have two roles to play. Consider first the pulse rates of the open strings on these instruments, and note that both the mass and length terms are in the denominator on the right-hand side of the equation, so increases in size, be they length or mass, lead to decreases in pulse rate. For a given member of the family (violin, viola, cello or contrabass), the length of the four strings is fixed, and as the size of the instrument increases, the string length gets longer in large steps. As a result, string length plays an important role in determining the register of the instrument. The mass of the string increases with its length, so it also contributes to the perceived register. But mass also plays another important role within the set of strings on an individual family member; the mass is varied across the strings to produce different note ranges on the four strings. Finally, the musician varies the length of individual strings to get the different notes within that string's range.
Instrument makers are very adept at using mass and length to vary the pulse rate of notes within a family. If a musician shortens the lightest string on the largest instrument, the contrabass, to a point near the end of the neck, the pulse rate of the note will actually be a little higher than the pulse rate of the open-string note of the heaviest string on the smallest member of the family, the violin. These are the notes just below middle C on the keyboard.
Excitation mechanisms of the woodwind and brass instrument families
The excitation of woodwind and brass instruments is described in terms of fluid mechanical 'valves' that momentarily close the flow of air through the instrument. The closure causes a sharp acoustic pulse which resonates in the body of the instrument. For woodwind instruments, the valve is the reed in conjunction with the lips. For brass instruments, the source is not clearly localised within the instrument. The source of energy is the stream of air produced by the player, who controls the pressure with the tension of the lips. The source of excitation is pulsatile because the mouthpiece is coupled to the tube between the mouthpiece and the bell (i.e. the body of the instrument), and the tube can only resonate at certain frequencies. Thus, the pulses originate from the lips, but the pulse rate is effectively determined by the length of the instrument, and the length is varied by the valves or slide to control the pulse rate of the note. In any event, these two families of instruments also produce pulse-resonance sounds in which the acoustic scale of the source controls the repetition rate of the note, and thus, the Ss component of the instrument's register. The pulsatile nature of the excitation generated by these systems, and the temporal regularity of the pulse stream, mean that the dominant components of the spectrum are strictly harmonic and they are phase locked (Fletcher and Rossing 1998). Fletcher (1978) provides a mathematical basis for understanding what is referred to as mode locking in musical instruments. Detailed descriptions of the mechanisms are provided in, for example, Benade (1976), Fletcher (1978), and McIntyre et al. (1983); a brief overview is provided in van Dinther and Patterson (2006).
Summary of the role of Ss in the perception of melody and register
Comparison of the excitation mechanisms for the different instrument families shows that the mechanisms are similar, inasmuch as they all produce regular streams of sharp pulses and the pulse rate is affected in the same way by the size of the components in the source; specifically, the pulse rate decreases as the size of the components increases. At the same time, the method whereby the pulse rate is varied to produce a melody is fundamentally different: the variable that controls pulse rate in the voice is the tension of the vocal folds, and the singer increases the tension to increase the pulse rate; whereas the variable that controls pulse rate in string instruments is string length, and the musician decreases the length to increase the pulse rate. The brass and woodwind instruments are like the strings, inasmuch as the pulse rate is varied to produce a melody by varying length rather than tension; they are different from the strings inasmuch as the length in this case is tube length rather than string length.
This brief overview of excitation mechanisms is intended to illustrate that, although different instrument families employ very different mechanisms to produce a regular stream of sharp pulses, and it is important for musicians to understand something of these mechanisms in order to play their instruments properly, nevertheless, all of these instruments produce pulse-resonance sounds, and the melody information in music is a sequence of pulse-rate values that specify the momentary acoustic scale of the source of excitation. Although the relationship between the physical variables involved in instrument excitation and the repetition rate of a given note is complex, the relationship between the acoustic variable, Ss, which summarizes the action of the source, and the pitch we perceive is straightforward. Indeed, the relationship is so direct that we usually use the units of acoustic scale, Hz or pulses per second, as the units for the auditory variable, pitch.
The filtering of the excitation pulses and the acoustic scale of the filter, Sf
Each of the pulses produced by the excitation mechanism of a sustained-tone instrument is filtered by body resonances within the instrument, and it is these body resonances that produce the resonances we observe following the pulses in the waveform. The resonances also produce the distinctive shape of the envelope in the frequency domain, and ultimately, the timbre of the instrument family. For stringed instruments, the prominent resonances are associated with the plates of the body (wood resonances), the body cavities (air resonances), and the bridge (Benade 1976). For brass and woodwind instruments, the prominent resonances are associated with the shape of the mouthpiece, which acts like a Helmholtz resonator, and the shape of the bell which determines the efficiency with which the spectral components radiate into the air (Benade and Lutgen 1988). Woodwind instruments have a tube resonance like brass instruments, but the filtering is complicated by the 'open-hole cutoff frequency' for woodwinds. The dominant resonances of speech sounds are determined by the shape of the vocal tract (Chiba and Kajiyama 1941; Fant 1960). So, just as there are many source mechanisms for generating the pulse stream, so there are many systems of body resonances to produce distinctive spectral envelopes.
Within a family, the most prominent distinction between the members of the family is the size of the body of the instrument, and the primary effect of instrument size on the perception of register is straightforward: if the size of an instrument is changed while keeping its shape the same, the result is a proportionate change in Sf, the acoustic scale of the filter mechanism in the body of the instrument. That is, if the three spatial dimensions of an instrument are increased by a factor, a, keeping the materials of the instrument the same, the natural resonances decrease in frequency by a factor of 1/a. The shape of the spectral envelope is preserved under this transformation, and so, if the spectral envelope is plotted on a log-frequency axis, the envelope shifts as a unit towards the origin, without changing shape, and the change in Sf will be the log of the relative size of the two instruments. This uniform scaling relationship is called 'the general law of similarity of acoustic systems' (Fletcher and Rossing 1998), and it is used to produce much of the difference in Sf between instruments. Numerical examples illustrating how the spatial dimensions of an instrument affect its resonances are provided by van Dinther and Patterson (2006) for two specific forms of resonator, Helmholtz resonators and flat plates.
Comparison of the filter systems of the different instrument families shows that the spectral envelope is affected in the same way by changes in the size of the filter-system components; specifically, the resonant frequencies decrease as body size increases and so the spectral envelope shifts towards the origin as the sizes of the components increase. So size affects the filter system in the same way as it affects the excitation mechanism, simply because bigger things vibrate more slowly. The wood-plate and bridge resonances of the string-family filter system are complex, and they are fundamentally different from the bell and mouthpiece resonances of the brass-family filter system, which are also complex. Despite the complexity of the relationship between the physical variables involved in body filtering and the shape of the resultant spectral envelope, the relationship between the acoustic properties and the perception of the notes is fairly straightforward; the shape of the envelope determines the sound quality, or family timbre, and the acoustic scale of the filter, Sf, summarizes the filter component of the perception of register. In all of these instrument families, the register decreases from soprano to bass as instrument size increases and the spectral envelope shifts toward the origin.
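The uniform scaling relationship can be illustrated with a small numerical sketch. The resonance frequencies below are hypothetical values chosen only for illustration, and the function names are not from the original; the only rule taken from the text is that scaling all dimensions by a factor a divides every resonance frequency by a.

```python
import math


def scaled_resonances(resonances_hz, size_factor):
    """Resonances after all spatial dimensions are multiplied by size_factor."""
    return [f / size_factor for f in resonances_hz]


def envelope_shift_octaves(size_factor):
    """Shift of the spectral envelope on a log-frequency axis, in octaves.

    Negative values mean a shift towards the origin (lower frequencies).
    """
    return -math.log2(size_factor)


small = [300.0, 450.0, 900.0]            # hypothetical body resonances, in Hz
large = scaled_resonances(small, 2.0)    # same shape, twice the size

# The envelope shape is preserved: ratios between resonances are unchanged.
shape_small = [f / small[0] for f in small]
shape_large = [f / large[0] for f in large]
shift = envelope_shift_octaves(2.0)      # one octave towards the origin
```

Because every resonance is divided by the same factor, the pattern of ratios is identical for the two instruments and only its position on the log-frequency axis changes.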
Acoustic scale and register range
In sections 3.1 and 3.2, the relationship between the physical variables involved in the production of musical notes, and the acoustic scale of the source and filter, was presented in theoretical terms without reference to the practicalities of constructing and playing instruments. It turns out that it is not possible to simply scale the spatial dimensions of instruments to achieve registers ranging from soprano to bass in most instrument families; the bass member would be too large and the soprano member too small. This section reviews the spatial scaling problem, and describes how the instrument makers produce notes with a wide range of acoustic scale without using excessively large or small instruments. The spatial scaling problem arises from the desire to simultaneously satisfy three design criteria for families of sustained-tone instruments:
The first criterion is that instruments should produce notes which are heard to have a strong musical pitch whose clarity and salience provide for effortless communication of novel melodies. This places an important constraint on the relationship between the acoustic scale variables, Ss and Sf. The instrument's filter system must resonate at frequencies corresponding to the first eight harmonics of the pulse rate of each note that the instrument is intended to play; that is, the instrument must emit significant amounts of acoustic energy in the range from the pulse rate of each note to three octaves above that pulse rate. This is necessary because the pitch of notes where the energy is carried by harmonics above about the tenth is not sufficiently salient to support accurate perception of novel melodies (Krumbholz et al. 2000; Pressnitzer et al. 2001). The second criterion is that the members of each instrument family should, together, produce notes that cover a significant portion of the musical scale, which for the keyboard encompasses about seven-octaves from about 27.5 - 3520 Hz. When combined with the first criterion, the second criterion effectively requires that the instruments of a given family have matched Ss and Sf values for all of the registers in the range from soprano to bass. This is a very demanding constraint, particularly when combined with the third criterion, which is that the instruments should be playable and portable. This last, practical constraint places limitations on the sizes of instruments which, in turn, means that the desired range of notes cannot be achieved by simply scaling instrument size in accordance with the law of acoustic similarity.
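The octave figures quoted in the first two criteria follow directly from frequency ratios. The short sketch below is only a check of that arithmetic; the function name is illustrative, and the numbers are the ones given in the text.

```python
import math


def octave_span(low_hz, high_hz):
    """Number of octaves between two frequencies."""
    return math.log2(high_hz / low_hz)


# First criterion: energy from the pulse rate up to its eighth harmonic.
pulse_rate = 200.0                                        # any pulse rate gives the same span
harmonic_span = octave_span(pulse_rate, 8 * pulse_rate)   # three octaves

# Second criterion: the keyboard range quoted in the text.
keyboard_span = octave_span(27.5, 3520.0)                 # seven octaves
```

The first eight harmonics always span three octaves regardless of the pulse rate, which is why the criterion can be stated either in harmonics or in octaves.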
There are problems for the instrument maker at both ends of the register range. For example, in the string family, there is a limit to how short the neck can be on the violin if the contact points for the notes are to be spaced far enough apart for a musician to play the instrument accurately and quickly. And at the other end of the range, if the instrument maker attempts to scale up the soprano version of the family to provide the bass member, the instruments become too large to play and too large to carry. Hutchins (1967, 1980) described the problems encountered when you try to construct a family of eight stringed instruments covering the entire range of orchestral registers based on the properties of the violin. The double bass member of the family would have to be six times the size of the violin, if simple scaling of instrument dimensions were to be used to shift the spectral envelope down by the required factor of six (about two and a half octaves). The length of a violin is about 0.6 m, so the double bass in this hypothetical family would have to be 3.6 m tall. The lower notes on the strings of such a double bass would not be reachable for most musicians and the instrument would not be portable. So, although instrument makers scale the dimensions of instruments to achieve much of the desired change in Ss and Sf, it is not possible to use the scaling of spatial dimensions, on its own, to provide the full range of registers in each family, and at the same time, ensure that the pitch of each note is sufficiently strong to support novel melody perception.
So, how do the instrument makers construct families of instruments that produce notes with salient pitches over the full range of registers from soprano to bass, and which are, at the same time, playable and portable? The first criterion of instrument production is immutable; the instrument must produce energy in the first three octaves of the pulse rate if the note is to have a well defined pitch. The third criterion is essential; the instruments have to be playable and portable. So how do the instrument makers provide such a wide range of notes on instruments with manageable sizes? This is where the knowledge and craft of the instrument maker come to the fore. What is required is not that the soprano instruments be excessively small and the bass instruments be excessively large; what matters is that the instruments produce notes with a wide range of Ss and Sf values, and that the Ss and Sf values are coordinated throughout the range. So what the instrument makers have done is find ways of extending the range of Ss and Sf, beyond what is practical with spatial-dimension scaling, by adjusting other physical properties of the instruments such as the mass of the strings, the thickness of the plates or the depth of the whole instrument. They scale the physical dimensions of the family so that the largest member is portable and the smallest member is playable, and then they adjust other physical properties of the instrument to achieve the desired acoustic scale values for the source mechanism and the filter system (e.g. Schelleng 1963).
With regard to the source of excitation in the string family, the strings on the larger members are not as long as the law of acoustic similarity would require because it would make the instruments unwieldy. Instead, the instrument makers increase the linear mass of the strings (the mass per meter) by increasing their diameter and by winding metal coils around the string. The increased linear mass causes the strings to vibrate more slowly, as dictated by equation 2. The instrument makers also use a change in mass to obtain the lower ranges of notes on the lower strings of any given member of the family. A comparison of the top and bottom strings of the violin illustrates the point. The highest string is tuned to a pulse rate of 660 pps (E5) while the lowest one is tuned to 196 pps (G3), a pulse-rate ratio of 3.4. The typical linear mass for the E5 string is about 0.4 g/m, and its tension is generally around 77 N. For the G3 string, these values are 2.4 g/m and 40 N. The ratio of the pulse rates of the two strings, which have the same vibrating length, can be calculated from equation 2, and it is found to be 3.4, as required.
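The violin-string comparison can be reproduced numerically. The sketch below assumes that equation 2 has the standard ideal-string form, f = sqrt(T / mu) / (2 L), where T is the tension, mu the linear mass and L the vibrating length. The vibrating length of 0.33 m is an assumed typical value that is not given in the text, and it cancels in the ratio. The function name is hypothetical.

```python
import math


def string_pulse_rate(length_m, tension_n, linear_mass_kg_per_m):
    """Fundamental rate of an ideal stretched string: sqrt(T / mu) / (2 L)."""
    return math.sqrt(tension_n / linear_mass_kg_per_m) / (2.0 * length_m)


SCALE_LENGTH_M = 0.33  # ASSUMED vibrating length of a violin string (not from the text)

# Tension (N) and linear mass (g/m converted to kg/m) quoted in the text
e5_rate = string_pulse_rate(SCALE_LENGTH_M, 77.0, 0.4e-3)
g3_rate = string_pulse_rate(SCALE_LENGTH_M, 40.0, 2.4e-3)
rate_ratio = e5_rate / g3_rate  # independent of the assumed length

print(f"E5: {e5_rate:.0f} pps, G3: {g3_rate:.0f} pps, ratio: {rate_ratio:.2f}")
```

With these values the two rates come out close to the nominal tunings of 660 and 196 pps, and the ratio is about 3.4, so the sixfold increase in linear mass and the lower tension together account for the interval between the strings without any change in length.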
With regard to the filter systems in the string family, those of the larger members of the family are not as large as the law of acoustic similarity would require, because it would make the instruments too heavy and too large. The instrument makers adapt the characteristics of the instruments to preserve the sound quality while keeping them usable. The main resonance is driven by the cavity mode of the body, which then works as a Helmholtz resonator. The volume of the instrument and the area of the f-holes are then the key parameters. The open strings of the cello are tuned to pulse rates three times lower than those of the violin. However, the plates of the body are only 2.1 times larger than those of the violin (Schelleng 1963), while the rib height of the cello is about four times that of the violin (Fletcher and Rossing 1998). The volume of the cello is, then, about 17 times that of the violin; this is equivalent to uniform spatial scaling by a factor of 2.6. To lower the body resonances to the desired values, the instrument makers vary the mass, thickness and arching of the body plates. Specifically, the body plate of the cello is made proportionally thinner than that of the violin, which lowers the body resonance frequency (Molin et al. 1988).
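The volume figures quoted for the cello follow from simple arithmetic. The sketch below assumes, as a rough approximation, that the cavity volume is proportional to plate area times rib height; the variable names are hypothetical.

```python
PLATE_SCALE = 2.1   # cello plate dimensions relative to the violin (Schelleng 1963)
RIB_SCALE = 4.0     # cello rib height relative to the violin (Fletcher and Rossing 1998)
TUNING_SCALE = 3.0  # open-string pulse rates are three times lower on the cello

# ASSUMPTION: cavity volume ~ plate length x plate width x rib height
volume_ratio = PLATE_SCALE ** 2 * RIB_SCALE

# Uniform scale factor that would produce the same volume ratio
equivalent_scale = volume_ratio ** (1.0 / 3.0)

print(f"volume ratio: {volume_ratio:.1f}")
print(f"equivalent uniform scaling: {equivalent_scale:.2f} (tuning ratio {TUNING_SCALE:.1f})")
```

The equivalent uniform scaling of about 2.6 falls short of the factor of three implied by the tuning, which is the gap the instrument makers close by adjusting the mass, thickness and arching of the plates.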
Finally, note that the range of notes covered by the vocal registers, from soprano to bass, is only about four octaves in total (from about C6 down to a little over C2), while the string family covers almost seven octaves (from just under C8 to just over C1). The singing teacher can help a vocalist strengthen notes at the ends of their natural range, but they cannot stretch the vocal tract or add significant mass to the vocal folds. (As an aside, when you have a cold, the phlegm adds mass to the vocal folds, which is why you can hit lower notes than normal for a few days.)
Acoustic-scale 'melodies' and the perception of pitch and timbre
The discussion of pitch and timbre in the remainder of the chapter is more readily understood when presented in terms of 'melodies' in which the acoustic scale values of the notes, Ss and Sf, vary according to the diatonic scale of Western music. These melodies and the discussion are presented on the Acoustic-scale melodies page to keep the current page under the 50k wiki limit.
Conclusions
The discovery that acoustic scale is a basic property of sound (Cohen 1993) leads to the conclusion that the major categories of timbre (vowel type and instrument family) are determined by spectral envelope shape, and that these categories of timbre are relatively independent of both the acoustic scale of the excitation source and the acoustic scale of the resonant filter. In speech, the acoustic scale variables, Ss and Sf, largely determine the voice quality of the speaker, and thus our perception of their sex and size (e.g., Smith and Patterson, 2005). With regard to timbre, this suggests that when dealing with tonal sounds that have pronounced resonances like the vowels of speech, it would be useful to distinguish between aspects of timbre associated with the shape of the spectral envelope, on the one hand, and aspects of timbre associated with the acoustic scale variables, Ss and Sf, on the other hand. This would lead to a distinction between the 'what' and 'who' of timbre, that is, what is being said, and who is saying it. This kind of distinction would at least represent progress towards a more informed use of the term timbre.
Acknowledgements
Research supported by the Medical Research Council (G0500221; G9900369).
References
- Benade AH (1976) Fundamentals of Musical Acoustics. Oxford University Press.
- Benade AH, Lutgen SJ (1988) The saxophone spectrum. J Acoust Soc Am 83:1900-1907.
- Chiba T, Kajiyama M (1941) The vowel, its nature and structure. Tokyo-Kaiseikan Pub Co., Tokyo.
- Cohen L (1993) The scale representation. IEEE Trans. Sig. Proc. 41:3275-3292.
- Fant G (1960) Acoustic Theory of Speech Production. Mouton De Gruyter, The Hague.
- Fitch WT, Giedd J (1999) Morphology and development of the human vocal tract: A study using magnetic resonance imaging. J Acoust Soc Am 106:1511-1522.
- Fletcher NH (1978) Mode locking in nonlinearly excited inharmonic musical oscillators. J Acoust Soc Am 64:1566-1569.
- Fletcher NH, Rossing TD (1998) The Physics of Musical Instruments. Springer, New York.
- Hutchins CM (1967) Founding a family of fiddles. Phys Today 20:23-37.
- Hutchins CM (1980) The new violin family. In: Benade AH (ed), Sound Generation in Winds, Strings, Computers. The Royal Swedish Academy of Music, pp.182-203.
- Krumbholz K, Patterson RD, Pressnitzer D (2000) The lower limit of pitch as determined by rate discrimination. J Acoust Soc Am 108:1170-1180.
- Lee S, Potamianos A, Narayanan S (1999) Acoustics of children's speech: developmental changes of temporal and spectral parameters. J Acoust Soc Am 105:1455-1468.
- McIntyre ME, Schumacher RT, Woodhouse J (1983) On the oscillations of musical instruments. J Acoust Soc Am 74:1325-1345.
- Molin NE, Lindgren L-E, Jansson EV (1988) Parameters of violin plates and their influence on the plate modes. J Acoust Soc Am 83:281-291.
- Peterson GE, Barney HL (1952) Control Methods Used in a Study of the Vowels. J Acoust Soc Am 24:175-184.
- Pressnitzer D, Patterson RD, Krumbholz K (2001) The lower limit of melodic pitch. J Acoust Soc Am 109:2074-2084.
- Schelleng JC (1963) The Violin as a Circuit. J Acoust Soc Am 35:326-338.
- Turner RE, Walters TC, Monaghan JJ, Patterson RD (2009) A statistical, formant-pattern model for segregating vowel type and vocal-tract length in developmental formant data. J Acoust Soc Am 125:2374-2386.
- van Dinther R, Patterson RD (2006) Perception of acoustic scale and size in musical instrument sounds. J Acoust Soc Am 120:2158-2176.