Technologies for Human-Machine Interaction

Robust voice modeling and processing

Acoustic modelling is one of the most critical elements in speech processing applications. Our research on this topic focuses on the study of language-independent acoustic units and robust acoustic feature selection. We are also investigating the use of deep Bayesian networks to improve acoustic modelling by providing an uncertainty measure in the decoding process. In general terms, our work aims to enable effective adaptation to different speakers and different acoustic environments.

Automatic speech recognition

Also commonly referred to as speech-to-text, automatic speech recognition (ASR) is a key enabling technology for most human-machine interaction applications. Since its beginnings, our research group has treated ASR as a major research line, with a special focus on applying ASR in adverse acoustic conditions.

In this field, our current research focuses on exploring novel acoustic modelling structures based on deep neural networks and end-to-end architectures. At the same time, we are evaluating the use of wide residual networks both in acoustic modelling for ASR and in related subtasks such as automatic punctuation. Furthermore, to mitigate the need for huge amounts of data, we are investigating unsupervised and semi-supervised learning approaches that infer new features directly from the raw waveform.

Natural language processing

Natural language processing (NLP) is a branch of language technologies that aims to give computers the ability to understand text in a way similar to how humans do. NLP comprises several different applications. Our research group has focused especially on the natural language generation task, whose aim is to generate text given a set of characteristics inferred from an external corpus. Over the years, different technologies have provided competitive results in this task, with n-gram models being used until the emergence of deep neural networks in language modelling.
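As a rough illustration of the n-gram approach mentioned above, the following minimal sketch builds a bigram model from a toy corpus and samples text from it. It is not the group's system: the function names `train_bigram` and `generate`, the `<s>` and `</s>` boundary markers, and the example sentences are all assumptions made for this illustration.

```python
import random
from collections import defaultdict


def train_bigram(corpus):
    """Record, for each word, every word that follows it in the corpus.

    Keeping repeated followers in the list means that sampling uniformly
    from it reproduces the observed bigram frequencies.
    """
    followers = defaultdict(list)
    for sentence in corpus:
        # Hypothetical sentence boundary markers.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            followers[prev].append(nxt)
    return followers


def generate(followers, max_words=20, seed=None):
    """Sample a word sequence by repeatedly drawing a follower of the last word."""
    rng = random.Random(seed)
    word, out = "<s>", []
    while len(out) < max_words:
        word = rng.choice(followers[word])
        if word == "</s>":  # end-of-sentence marker reached
            break
        out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    # Toy corpus, invented for this example.
    corpus = ["the system answers the user", "the user asks the system"]
    model = train_bigram(corpus)
    print(generate(model, max_words=10, seed=0))
```

Every pair of adjacent words in the output has been seen in the corpus, which is what makes such models locally fluent yet unable to capture long-range context, one of the limitations that neural language models address.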

Another interesting application of NLP is spoken dialogue systems. Our research group has extensive experience in this research line, with several systems developed in collaboration with different Spanish research groups and several competitively funded projects on this topic.
