Speaker verification; Machine learning; Biometrics; Anti-spoofing; Speaker recognition
Muckenhirn Hannah, Abrol Vinayak, Magimai.-Doss Mathew, Marcel Sébastien (2019), Understanding and Visualizing Raw Waveform-Based CNNs, in Interspeech 2019, Graz, Austria, International Speech Communication Association, International Speech Communication Association Archive.
Kabil Selen Hande, Muckenhirn Hannah, Magimai.-Doss Mathew (2018), On Learning to Identify Genders from Raw Speech Signal Using CNNs, in Interspeech 2018, International Speech Communication Association, International Speech Communication Association Archive.
Muckenhirn Hannah, Magimai.-Doss Mathew, Marcel Sébastien (2018), On Learning Vocal Tract System Related Speaker Discriminative Information from Raw Signal Using CNNs, in Interspeech 2018, Hyderabad, India, ISCA, ISCA.
Muckenhirn Hannah, Magimai.-Doss Mathew, Marcel Sébastien (2018), Towards directly modeling raw speech signal for speaker verification using CNNs, in IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, IEEEXplore.
Muckenhirn Hannah, Magimai.-Doss Mathew, Marcel Sébastien (2017), End-to-End Convolutional Neural Network-based Voice Presentation Attack Detection, in Proceedings of International Joint Conference on Biometrics, IEEE, IEEEXplore.
Muckenhirn Hannah, Korshunov Pavel, Magimai.-Doss Mathew, Marcel Sébastien (2017), Long-Term Spectral Statistics for Voice Presentation Attack Detection, in IEEE/ACM Transactions on Audio, Speech and Language Processing, 25(11), 2098-2111.
Korshunov Pavel, Marcel Sébastien, Muckenhirn Hannah, Gonçalves A. R., Mello A. G. Souza, Violato R. P. Velloso, Simões Flávio, Uliani Neto Mário, de Assis Angeloni Marcus, Stuchi J. A., Dinkel H, Chen N, Qian Yanmin, Paul D, Saha G, Sahidullah Md (2016), Overview of BTAS 2016 Speaker Anti-spoofing Competition, in IEEE International Conference on Biometrics: Theory, Applications and Systems, IEEE, IEEEXplore.
Muckenhirn Hannah, Magimai.-Doss Mathew, Marcel Sébastien (2016), Presentation Attack Detection Using Long-Term Spectral Statistics for Trustworthy Speaker Verification, in International Conference of the Biometrics Special Interest Group (BIOSIG), IEEE, IEEEXplore.
The goal of automatic speaker recognition is to recognize persons through their voice. Automatic speaker verification is a subtask of speaker recognition in which the goal is to verify or authenticate a person. State-of-the-art speaker verification systems typically model short-term spectrum-based features, such as mel-frequency cepstral coefficients (MFCCs), through a generative model, such as Gaussian mixture models (GMMs), and employ a series of compensation methods to achieve low error rates. This approach has two main limitations. First, it requires sufficient training data for each speaker for robust modeling, and sufficient test data to apply the series of compensation techniques when verifying a speaker. Second, the speaker verification system is prone to malicious attacks, for example through voice conversion (VC) or text-to-speech (TTS) systems. The main reason is that the front-end features and back-end models of the speaker verification system, namely MFCCs and GMMs, are similar to those of VC and TTS systems.

The proposed project aims to address these limitations through the development of novel approaches for trustworthy speaker verification. To achieve that, through collaboration between researchers from the Speech and Audio Processing group and the Biometrics group at Idiap, the proposed project focuses on two lines:

1. In the ongoing DeepSTD project funded by the Hasler Foundation, it was shown in the context of speech recognition that recognition systems can be built by directly modeling raw speech signals using artificial neural networks. The proposed project aims to build on that approach to develop a generic speaker verification approach that can be used for both speaker verification and speaker diarization.

2. In a collaborative study with researchers from the University of Eastern Finland and Nanyang Technological University (Singapore), Idiap has developed a countermeasure approach for state-of-the-art speaker verification systems. The proposed project aims to extend this approach, along with the development of novel anti-spoofing countermeasures using binary features and text-dependent speaker verification.