The human voice is the product of a complex biological process that is influenced by myriad factors, including the composition and geometry of the human vocal apparatus, physiological processes in the body, mental processes, and muscular agility and inertia. Due to the enormous number of parameters that play a role in creating the human voice signal, no two voices in the world are alike. Voice can potentially be just as revealing of an individual’s identity as the person’s DNA or fingerprints. In fact, it can be more revealing, since it carries signatures of the speaker’s state and surroundings at the time of speaking. This research casts the problem of reconstructing or profiling humans from voice as an independent focus area that comprises the concurrent deduction of physical, physiological, behavioral, demographic, environmental and sociological facts about humans from their voice. The scientific challenges that must be addressed in this process extend far beyond the isolated capabilities of state-of-the-art technologies in all disciplines that have studied the human voice. Yet, as an ensemble, these disciplines provide a solid foundation for this new area of research. There is a wealth of information dispersed across the more than 30 scientific disciplines that have studied various aspects of the human voice. The human voice has been variously linked to the speaker’s age, ethnicity, height, weight, psychological and physiological states, genetics (and, by corollary, the speaker’s bone structure, skin color, eye color, etc.), and more. In addition, it has been shown in multiple small-scale studies to reflect the speaker’s surroundings, origins and history. Nevertheless, there is negligible research on how these signatures may be measured or extracted. Additionally, voice today is mostly transmitted over communication channels and captured using a variety of mechanisms, and the interplay of these channels and mechanisms with the signatures embedded in voice has to be understood.
This research collates all of these issues and uses the power of artificial intelligence and modern statistical and rule-based analytical techniques of automated discovery to address them.
This project falls within the technical area of Artificial Intelligence (AI) applied to Voice Forensics. Within this, it represents a specialized sub-area of our broader effort on profiling humans from their voice. The project will focus on building technologies and systems for deducing person- and location-specific descriptive information from voice recordings. These recordings may be from a wide variety of sources, such as hoax calls, wiretaps, telephone conversations, and voice evidence from a multitude of crimes.
Specifically, we expect to make significant advances towards the following goals:
- Next-generation techniques for micro-feature discovery
- Detecting maritime sound-emitting objects in Mayday call recordings
- Building an alpha/beta version of a web-based interface for profiling
- Advances in technology for profiling from disguised voices
- Extension of the work to wiretaps and interviews
