Acoustic Characteristics of Voice in Music and Straight Theatre, and Related Aspects of Production and Perception

English title Acoustic Characteristics of Voice in Music and Straight Theatre, and Related Aspects of Production and Perception
Applicant Maurer Dieter
Number 159350
Funding scheme Project funding (Div. I-III)
Research institution Institute for the Performing Arts and Film Zürcher Hochschule der Künste
Institution of higher education Zurich University of the Arts - ZHdK
Main discipline Music, Theatre
Start/End 01.09.2015 - 31.07.2018
Approved amount 367'978.22

Keywords (9)

Vowel; singer's formant cluster; Formants; Voice; actor's/speaker's formant cluster; Formant tuning; Acting; F0; Singing

Lay Summary (translated from German)

Lead
The results of a preceding SNSF project, in which vocal and speech characteristics of trained singers and actresses/actors were compared with those of untrained speakers, call for further investigation of the aspects addressed in the present project: (i) text intelligibility at high pitches, (ii) dependence of the voice- and speech-specific spectrum < 1.5 kHz on pitch, and related cases of "formant pattern ambiguity", (iii) the acoustic effect of style-specific spectral components > 1.5 kHz on voice projection, (iv) age- and gender-specific spectral differences in vocal and speech utterances, (v) the relationship between the resonance effect of the vocal tract and the spectrum of the voice sound produced under pronounced variation of pitch.
Lay summary

The results of a preceding SNSF project, in which vocal and speech characteristics of trained singers and actresses/actors were compared with those of untrained speakers, call for further investigation of the aspects addressed in the present project:

(1) To clarify the intelligibility of speech at high fundamental frequencies F0, recordings of comedians and voice-over speakers are made.

(2) To clarify the observed relationship between F0 and the speech-specific spectrum < 1.5 kHz (cf. "formant tuning"), experiments with resynthesis, synthesis, and high- and low-pass filtering are conducted.

(3) To clarify the acoustic effect of increased energy > 1.5 kHz and the corresponding style-specific shaping (cf. "singer's/actor's formant cluster"), recordings of singers and actresses/actors are made on stage, taking into account different distances between stage and microphone.

(4) To clarify the influence of the relationship between F0 and the speech-specific spectrum < 1.5 kHz on age- and gender-specific spectral features, experiments with resynthesis and synthesis are conducted.

(5) To clarify the relationship between F0, the speech-specific spectrum < 1.5 kHz, and articulation, resonance measurements of the vocal tract are made during vocal and speech production.

All naturally produced, manipulated, or synthesised recordings are subjected to perception tests.

Significance: Together with the recordings made and the investigations carried out in the preceding project, a previously non-existent empirical basis will be available in the form of a database of more than 35'000 recordings of trained and untrained voices, which will give differentiated insight into the acoustics of stage voices as well as of vocal and speech utterances in general.

Last update: 04.08.2015

Lay Summary (English)

Lead
In a preceding project, major style-specific aspects of the acoustics of stage voices were re-examined. The findings call for further clarification, which the present follow-up project addresses: (i) phoneme and text intelligibility at high pitches, (ii) dependence of lower spectral characteristics on fundamental frequency F0 and formant pattern ambiguity, (iii) the acoustic effect of style-specific higher spectral characteristics on voice projection, (iv) age and gender-related spectral differences, (v) the relationship between vocal tract resonances and spectral characteristics of sounds produced in different styles and at different levels of F0.
Lay summary
In a preceding SNSF project, major style-specific aspects of the acoustics of stage voices were re-examined. On the basis of the corresponding findings, some aspects of stage-specific as well as of general voice production and perception need further investigation within the present project:

(1) To clarify phoneme and text intelligibility on high pitches, recordings of comic and voice-over actresses/actors and related listening tests are conducted.

(2) To clarify the observed F0-dependence of lower spectral characteristics (of importance for the concept of formant tuning), resynthesis, synthesis, and low and high-pass filtering experiments are conducted including corresponding listening tests.

(3) To clarify the acoustic effect of increased intensity in mid and high spectral ranges and the shaping of the spectral envelope > 2.5kHz, recordings live on stage are conducted including corresponding listening tests.

(4) To clarify the observed F0-dependence of lower spectral characteristics and their impact on expected age and gender-related spectral differences, resynthesis experiments and related listening tests are conducted.

(5) To clarify the relation between F0-dependent changes of vowel-specific spectral characteristics and style-specific spectral characteristics on the one hand, and articulation on the other, a parallel investigation of sound production and measurement of vocal tract resonances is conducted.

Relevance: The present project aims at contributing to interdisciplinary research on the fundamentals and the aesthetics of the voice. As a result of the previous and the present project, we will provide an empirical basis in the form of a sound database including more than 35’000 single recordings of trained and untrained voices, unique in its systematic design and extent, which will allow for in-depth insight into the acoustics of stage voices and into general aspects of voice and vowel acoustics.

Last update: 04.08.2015

Publications

Publication
Influences of Fundamental Oscillation on Speaker Identification in Vocalic Utterances by Humans and Computers
Dellwo Volker, Kathiresan Thayabaran, Pellegrino Elisa, He Lei, Schwab Sandra, Maurer Dieter (2018), Influences of Fundamental Oscillation on Speaker Identification in Vocalic Utterances by Humans and Computers, in Interspeech 2018, Hyderabad, International Speech Communication Association (ISCA), Hyderabad, India.
Formant pattern and spectral shape ambiguity in vowel synthesis: The role of fundamental frequency and formant amplitude
Kathiresan Thayabaran, Maurer Dieter, Suter Heidy, Dellwo Volker (2018), Formant pattern and spectral shape ambiguity in vowel synthesis: The role of fundamental frequency and formant amplitude, in The Journal of the Acoustical Society of America, Minneapolis, 143(3), 1919-1920, ASA, Melville, NY.
Sinewave vowel sounds: The role of vowel qualities, frequencies and harmonicity of sinusoids, and perceived pitch for vowel recognition
Maurer Dieter, Suter Heidy, Kathiresan Thayabaran, Dellwo Volker (2018), Sinewave vowel sounds: The role of vowel qualities, frequencies and harmonicity of sinusoids, and perceived pitch for vowel recognition, in The Journal of the Acoustical Society of America, Minneapolis, 143(3), 1920, ASA, Melville, NY.
The Zurich Corpus of Vowel and Voice Quality, Version 1.0
Maurer Dieter, d'Heureuse Christian, Suter Heidy, Dellwo Volker, Friedrichs Daniel, Kathiresan Thayabaran (2018), The Zurich Corpus of Vowel and Voice Quality, Version 1.0, in Interspeech, Hyderabad, 1417-1421, International Speech Communication Association (ISCA), Hyderabad, India.
Why a phenomenology of vowel sounds is needed.
Maurer Dieter (2018), Why a phenomenology of vowel sounds is needed, in Proceedings of the Conference on Phonetics & Phonology in German-speaking countries, Berlin, Humboldt Universität Berlin, Berlin.
“Flat” vowel spectra revisited in vowel synthesis
Maurer Dieter, Suter Heidy (2017), “Flat” vowel spectra revisited in vowel synthesis, in The Journal of the Acoustical Society of America, Boston, 141(5), 3469, ASA, Melville, NY.
Formant pattern ambiguity of vowel sounds revisited in synthesis: Changing perceptual vowel quality by only changing fundamental frequency
Maurer Dieter, Dellwo Volker, Suter Heidy, Kathiresan Thayabaran (2017), Formant pattern ambiguity of vowel sounds revisited in synthesis: Changing perceptual vowel quality by only changing fundamental frequency, in The Journal of the Acoustical Society of America, Boston, 141(5), 3469-3470, ASA, Melville, NY.
Vowel synthesis related to equal-amplitude harmonic series in frequency ranges > 1 kHz combined with single harmonics < 1 kHz, and including variation of fundamental frequency
Maurer Dieter, Suter Heidy (2017), Vowel synthesis related to equal-amplitude harmonic series in frequency ranges > 1 kHz combined with single harmonics < 1 kHz, and including variation of fundamental frequency, in The Journal of the Acoustical Society of America, Boston, 141(5), 3469, ASA, Melville, NY.
Enhancing the objectivity of interactive formant estimation: introducing euclidean distance measure and numerical conditions for numbers and frequency ranges of formants
Kathiresan Thayabaran, Maurer Dieter, Dellwo Volker (2017), Enhancing the objectivity of interactive formant estimation: introducing euclidean distance measure and numerical conditions for numbers and frequency ranges of formants, in Elektronische Sprachsignalverarbeitung 2017, (86), 130-137, TUDpress, Saarbrücken.
Vowel recognition at fundamental frequencies up to 1 kHz reveals point vowels as acoustic landmarks
Friedrichs Daniel, Maurer Dieter, Rosen Stuart, Dellwo Volker (2017), Vowel recognition at fundamental frequencies up to 1 kHz reveals point vowels as acoustic landmarks, in Journal of the Acoustical Society of America, 142(2), 1025-1033.
Automatic selection of the number of poles for different gender and age groups in steady-state isolated vowels
Kathiresan Thayabaran, Maurer Dieter, Dellwo Volker (2016), Automatic selection of the number of poles for different gender and age groups in steady-state isolated vowels, in The Journal of the Acoustical Society of America, Honolulu, 140(4), 3058, ASA, Melville, NY.
How listeners recognise vowel sounds under highpass or lowpass filtering of vowel-specific frequency ranges
Maurer Dieter, Kathiresan Thayabaran, Suter Heidy, Dellwo Volker (2016), How listeners recognise vowel sounds under highpass or lowpass filtering of vowel-specific frequency ranges, in The Journal of the Acoustical Society of America, Honolulu, 140(4), 3217, ASA, Melville, NY.
Mapping vowel categories at high fundamental frequencies using multidimensional scaling of cochlea-scaled spectra
Friedrichs Daniel, Rosen Stuart, Iverson Paul, Maurer Dieter, Dellwo Volker (2016), Mapping vowel categories at high fundamental frequencies using multidimensional scaling of cochlea-scaled spectra, in The Journal of the Acoustical Society of America, Honolulu, 140(4), 3219, ASA, Melville, NY.
Acoustics of the Vowel
Maurer Dieter (2016), Acoustics of the Vowel, Peter Lang AG, Bern.
Vowel sounds produced with varying production parameters: Conceptualisation and realisation of a database
Maurer Dieter, Kathiresan Thayabaran, Suter Heidy, Dellwo Volker (2016), Vowel sounds produced with varying production parameters: Conceptualisation and realisation of a database, in 25th Annual Conference of the International Association for Forensic Phonetics and Acoustics, York, University of York, York.

Collaboration

Group / person Country
Types of collaboration
Otto Falckenberg Schule, Fachakademie für Darstellende Kunst, Munich, Andreas Sippel Germany (Europe)
- in-depth/constructive exchanges on approaches, methods or results

Associated projects

Number Title Start Funding scheme
185399 The dynamics of indexical information in speech and its role in speech communication and speaker recognition 01.12.2019 Project funding (Div. I-III)
143943 Akustische Eigenschaften der Stimme im Musik- und Sprechtheater: Aufbau systematischer empirischer Grundlagen 01.03.2013 Project funding (Div. I-III)
183152 "Voice Theft": Chances and risks of digital voice technology 01.12.2018 Digital Lives

Abstract

Ongoing project and first results: In an ongoing 2-year SNSF project, major style-specific aspects of the acoustics of stage voices are re-examined: extensive pitch variation and formant tuning, increased intensity in mid and high spectral ranges, and the singer’s or speaker’s/actor’s formant cluster (hereafter SF or SPF). Extensive and systematic recordings include utterances of 42 professional singers, actresses/actors, and non-professional speakers. Sounds of the long German vowels /i, y, e, ø, ɛ, a, o, u/ in V, sVsV, and minimal pair conditions are investigated with varying basic production parameters such as pitch, vocal effort, production style, and phonation type. Additional references for “average” conversational speech are also created. Acoustic analysis (single sounds) and statistical analysis (sound samples) are conducted. Perceptual vowel quality is investigated in listening tests. In total, a corpus of more than 30'000 utterances of 42 speakers is created in order to build up an empirical reference on the matter.

Up to now, our analyses provide strong indications for the following:

(1) Vowel discrimination is possible at high pitches, up to an F0 of c. 880 Hz. Neither spectral undersampling nor “oversinging” the statistical F1 is found to directly impair vowel discrimination.

(2) Depending on vowels and F0 ranges, lower spectral peaks and, if determinable, related formants < 1.5 kHz shift with rising F0. These shifts are found for both professionals and non-professionals, and there are indications that a substantial part of them maintains the perceived vowel quality.

(3) Speaker group differences in the vowel spectra < 1.5 kHz decrease or disappear when comparing sounds at similar F0 levels.

(4) Visual inspection of the spectra in general confirms findings reported in the literature of increased intensity in mid and high spectral ranges and of SF for professionals. However, methods of acoustic analysis need further investigation. Moreover, the effect of these acoustic characteristics on vowel perception is also in question.

(5) No robust evidence for SPF was found, which may be due to the recording procedure applied.

Need for clarification: On the basis of these indications, major aspects of stage-specific as well as of general voice production and perception have to be addressed. Firstly, for a clarification of phoneme and text intelligibility at high pitches, additional recordings of comic and voice-over actresses/actors are needed. Secondly, clarification of the observed F0-dependence of lower spectral characteristics (of importance for the concept of formant tuning) requires resynthesis and high-pass filtering experiments. The same holds true for the clarification of speaker group differences. Thirdly, the relation between F0-dependent changes of vowel-specific spectral characteristics and articulation should be clarified, which requires a parallel investigation of sound production and either imaging of the vocal tract or measurement of its resonances. Fourthly, the relation between F0-dependent changes of vowel-specific spectral characteristics and cortical representation should also be clarified. Fifthly, the acoustic effect of the levels of spectral intensity of different frequency bands and the shaping of the spectral envelope > 2.5 kHz needs to be clarified, which requires additional recordings live on stage, including additional investigation of SPF.

The present follow-up project addresses these issues within the following experiments: additional recordings (in the studio and live on stage in a large concert hall); vowel synthesis (related to statistical formant patterns) and resynthesis (related to natural vocalisations) with extensive F0 variation, testing constancy/alteration of vowel perception; vocal tract investigation (with MRI and with vocal tract resonance measurement), testing the parallelism between articulation and vowel-specific spectral characteristics of sounds produced at very different F0; investigation of cortical vowel representation (with EEG), testing constancy/alteration of the representation for vowel sounds at very different F0. Listening tests are included in all experiments. Great effort is also made towards an extensive open access publication (entire sound database and eBook). Improvement of the methods of acoustic analysis for sounds at high pitches is attempted in the ongoing project.

The empirical basis provided, unique in its systematic design and extent, and the clarifications attempted will allow for in-depth insight into the acoustics of stage voices, into general aspects of vowel acoustics, and into some important and related aspects of production, perception, and cortical processes, indispensable for further research and for an assessment of the relevance of acoustical descriptions for vocal education and the use of technical aids on stage.
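
As a rough illustration of the "spectral undersampling" mentioned in the abstract: in a harmonic sound the partials lie at integer multiples of F0, so fewer of them fall below 1.5 kHz as F0 rises. The following Python sketch is not part of the project's own analysis; it assumes an ideal harmonic series, takes the 1.5 kHz limit from the abstract, and uses a hypothetical helper name `harmonics_below`.

```python
from typing import List


def harmonics_below(f0_hz: float, limit_hz: float = 1500.0) -> List[float]:
    """Return the harmonic frequencies k * f0_hz that lie strictly below limit_hz.

    Assumption: an ideal harmonic series (partials at exact integer
    multiples of F0); real voices only approximate this.
    """
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")
    harmonics = []
    k = 1
    while k * f0_hz < limit_hz:
        harmonics.append(k * f0_hz)
        k += 1
    return harmonics


# Illustrative F0 values; 880 Hz is the upper value cited in the abstract.
for f0 in (110.0, 220.0, 440.0, 880.0):
    count = len(harmonics_below(f0))
    print(f"F0 = {f0:5.0f} Hz: {count:2d} harmonic(s) below 1.5 kHz")
# F0 =   110 Hz: 13 harmonic(s) below 1.5 kHz
# F0 =   220 Hz:  6 harmonic(s) below 1.5 kHz
# F0 =   440 Hz:  3 harmonic(s) below 1.5 kHz
# F0 =   880 Hz:  1 harmonic(s) below 1.5 kHz
```

At an F0 of 880 Hz only the fundamental itself lies below 1.5 kHz under this assumption, so the spectral envelope in that range is sampled at a single point.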