People talk to smartphones and call services every day. Researchers at the University of Passau are working to advance automatic speech and speaker recognition by taking social factors into account.
Automatic speech and speaker recognition have recently matured to the point that they have entered the daily lives of thousands of Europe's citizens, e.g., on their smartphones or in call services. Over the coming years, speech processing technology will move to a new level of social awareness in order to make interaction more intuitive, make speech retrieval more efficient, and lend additional competence to computer-mediated communication and speech-analysis services in the commercial, health, security, and further sectors.
Evolving analysis of speaker characteristics
To reach this goal, machines must reliably identify rich speaker traits and states, such as age, height, personality, and physical and mental state, as carried by the tone of the voice and the spoken words. In the iHEARu project, ground-breaking methodology, including novel techniques for multi-task and semi-supervised learning, will for the first time deliver intelligent, holistic, and evolving analysis under real-life conditions of universal speaker characteristics that have so far been considered only in isolation.
Today's scarcity of annotated realistic speech data will be overcome by large-scale mining of speech and metadata from public sources such as social media, by crowd-sourcing for labelling and quality control, and by shared semi-automatic annotation. All stages, from pre-processing and feature extraction to statistical modelling, will evolve in "life-long learning" as new data arrive, utilising feedback, deep, and evolutionary learning methods. Human-in-the-loop system validation and novel perception studies will analyse the self-organising systems and the relation of automatic signal processing to human interpretation across a previously unseen variety of speaker classification tasks. The project's work plan offers a unique opportunity to turn current world-leading expertise in this field into a new de facto standard of speaker characterisation methods and open-source tools, ready for tomorrow's challenge of socially aware speech analysis.
Participants and funding
Prof. Dr. Björn Schuller, head of the Chair for Complex and Intelligent Systems, leads the project. The University of Passau collaborates with the Technical University of Munich. The project is funded by the European Union under the 7th Framework Programme through one of the European Research Council's prestigious "Starting Grants", the highest grant for the promotion of young talent.
Principal Investigator(s) at the University | Prof. Dr. Björn Schuller (Chair for Complex and Intelligent Systems) |
---|---|
Project period | 01.01.2014 - 31.12.2018 |
Website | http://www.ihearu.eu/ |
Source of funding | European Union (EU) > EU - 7th Framework Programme (FP7) > EU - FP7 - European Research Council (ERC) > EU - FP7 - ERC - Starting Grant |
Project number | 338164 |