What makes a voice unique?
A new digital speech database that captures how voices vary between speakers and situations, built for use in forensic speaker comparison, has been launched by researchers at the University of Cambridge.
Motivated primarily by the need within forensic phonetics for data on how voices vary in the population, the Dynamic Variability in Speech (DyViS) database was collected and annotated by researchers in the Department of Theoretical and Applied Linguistics.
The task of forensic speaker comparison tends to be constrained by a lack of information on how given speech parameters, whether of an acoustic or a phonetic nature, vary within a population. Existing speech databases often fail to control for accent variation, whereas the relevant population for a given forensic examination will usually be speakers from a homogeneous speech community.
The DyViS database contains speech from 100 male speakers carefully selected to have the same variety of English, thus providing a research resource for studies of personal voice characteristics without differences of dialect and accent.
The participants were speakers with what is often called Standard Southern British English (SSBE) pronunciation, meaning that the database also provides a wealth of material for those wishing to study or teach the contemporary equivalent of British Received Pronunciation.
Speakers were recorded at studio quality doing four different tasks, two involving spontaneous speech and two read speech: undergoing a mock police interview, holding a telephone conversation with an alleged co-conspirator (recorded simultaneously over the public telephone network), reading a passage, and reading sentences.
Twenty of the speakers additionally provided a second recording, separated by three months, of the reading tasks, so that variation within a single voice can be studied.
The DyViS database will allow relevant population statistics to be determined for the speakers in the study, and to be extrapolated to the wider population of similar speakers; and, in the case of acoustic parameters resulting from physical variation in speakers, to be inferred for speakers of other accents.
Beta versions of the database have already been used for studies as diverse as describing the vowels of contemporary SSBE, determining the speaker-specificity of non-linguistic oral ‘clicks’ and of hesitation sounds in speaking, extracting statistics on the distribution of speaking pitch, and testing the performance of an automatic speaker identification system.
“The DyViS database will be of interest to researchers involved in forensic phonetics, speech technology (especially but not exclusively speaker identification and verification), and the analysis of spoken dialogue,” said Dr Charlanne Ward of Cambridge Enterprise, the University’s commercialisation arm. “It will also be useful to anyone needing an extensive repository of spoken British English, broadly of the variety found in textbooks for foreign learners but reflecting current developments in pronunciation.”
The database is available via the Economic and Social Data Service. It is free to researchers in academic institutions via the UK Access Management Federation, and under licence for research purposes to commercial organisations after registration with the UK Data Archive.
Full details of how to register can be found by accessing the database at the above link, clicking on “Download/Order” and then following the “how to register” link. Commercial users will be required to provide details of the intended use of the data during the order process. ESDS will then contact Cambridge Enterprise to approve the user’s order and provide a commercial licence agreement for signature. The data will be made available for download by the commercial user once the signed licence has been received.
The research of which the DyViS database forms a part was primarily funded by the Economic and Social Research Council, with assistance from BT.
Photo credit: Microphones, by Rusty Sheriff via flickr.