Institution
Ecole Nationale Supérieure d'informatique
Affiliation
Département de Post-Graduation
Author
CHELLALI, Sara
Thesis supervisor
Ahfir Maamar (Associate Professor)
Co-supervisor
Hidouci Khaled Walid (Associate Professor)
Field
Computer Science
Degree
Doctorate
Title
Development of a Robust ASR to Reverberation
Keywords
Room Acoustics - Automatic Speech Recognition - Reverberation and Dereverberation - Acoustic/Speech Signal Processing
Abstract
Reverberation is known to be very damaging to the performance of automatic speech recognition (ASR), because information from one frame of data is spread into subsequent frames in a distorting manner. The growing use of hands-free mode in smartphones and Skype-type terminals, for example, makes the robustness of ASR to reverberation increasingly important. Reverberation is caused by the room in which the useful sound source is placed, through reflections of the acoustic wave. It is particularly noticeable in hands-free communication and, more generally, whenever sound is recorded in an enclosure without specific acoustic treatment. When reverberation is relatively strong, the source seems subjectively far away from the listener and speech intelligibility is degraded (a barrel-like sound quality). Dereverberation consists of removing the room effect from the desired signal, completely or partially. Perfect dereverberation would leave only the wave arriving directly from the source and would produce the same effect as a recording made in an anechoic chamber. This is not required in room acoustics, because a moderate room effect can make speech sound natural.
Current automatic speech recognition systems almost universally assume noiseless speech input: they have been trained with, and expect, an incoming signal free of any ambient or interfering noise. Typical ways of achieving this have been to require the speaker to be in an acoustically treated room, to restrict ambient noise, or to use special microphones. There has been great interest in using a desktop microphone as the input device of the recognition system instead of an unnatural headset microphone. Unfortunately, making the input device more natural for the user can cause the word/phoneme recognition rate of the system to fall dramatically. Most of the degradation is thought to be caused by the acoustical characteristics of the recording environment, particularly the reverberation. For an ASR whose training data set was recorded in an anechoic chamber, perfect dereverberation is required to completely remove reverberation from the speech signal captured by the microphone before it is fed to the recognition system. This problem arises in many ASR applications where the recording environment is part of the overall communication chain.
The main goal of this project is to develop an ASR that is robust to reverberation. The objectives of this research project are:
- Research and evaluation of recently developed ASR systems for the French language.
- A novel algorithm must be proposed for French phoneme recognition, and then for continuous French speech recognition in single-speaker and multi-speaker settings.
- Robustness testing of the newly developed algorithm/ASR against reverberation.
- Dereverberation techniques/algorithms may be employed to improve the performance of the developed ASR.
- Results must be validated in the context of a specific, realistic application.
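The statement that reverberation spreads information from one frame into subsequent frames can be illustrated with a minimal sketch. It assumes the common model in which the reverberant signal is the clean signal convolved with a room impulse response. Nothing below comes from the thesis itself: the function names, the 16 kHz sampling rate, the 25 ms frame length, the 0.5 s reverberation time, and the synthetic impulse response (exponentially decaying noise) are all illustrative assumptions.

```python
import numpy as np


def synthetic_rir(fs=16000, rt60=0.5, seed=0):
    """Hypothetical room impulse response: noise with an exponential decay.

    The amplitude envelope falls by 60 dB after rt60 seconds.
    This is a toy stand-in, not a measured or simulated room.
    """
    rng = np.random.default_rng(seed)
    n = int(fs * rt60)
    t = np.arange(n) / fs
    envelope = 10.0 ** (-3.0 * t / rt60)  # -60 dB in amplitude at t = rt60
    h = rng.standard_normal(n) * envelope
    h[0] = 1.0  # direct path from source to microphone
    return h


def reverberate(x, h):
    """Model reverberation as linear convolution with the impulse response."""
    return np.convolve(x, h)


def frame_energies(signal, frame_len, n_frames):
    """Energy of the first n_frames non-overlapping frames."""
    return [float(np.sum(signal[i * frame_len:(i + 1) * frame_len] ** 2))
            for i in range(n_frames)]


fs = 16000
frame_len = 400  # 25 ms at 16 kHz (assumed frame size)

# Clean signal: energy in the first frame only, silence afterwards.
x = np.zeros(4 * frame_len)
x[:frame_len] = np.random.default_rng(1).standard_normal(frame_len)

h = synthetic_rir(fs)
y = reverberate(x, h)

clean_energy = frame_energies(x, frame_len, 4)
reverb_energy = frame_energies(y, frame_len, 4)

print("clean frame energies: ", clean_energy)
print("reverb frame energies:", reverb_energy)
```

In the clean signal only the first frame carries energy, whereas after convolution the later frames are no longer silent. That leakage is what a frame-based recogniser sees as distortion.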
Status
Verified