DEPARTMENT OF TELECOMMUNICATIONS AND MEDIA INFORMATICS
Budapest University of Technology and Economics - Faculty of Electrical Engineering and Informatics

List of Topics

Speaker Adaptation Based Deep Neural Network Text-to-Speech Synthesis
Speech processing has attracted the interest of both academia and industry over the last few decades. Speech synthesis is the technique of converting text into artificial speech. It can be used in assistive tools for blind users, web browsers, mobile phones, PCs, and laptops. Current research focuses on making synthesized speech sound as natural as possible. This project aims to create a speaker adaptation model that uses a deep neural network to synthesize speech. The project will be carried out using Merlin, a speech synthesis toolkit that uses neural networks to generate speech.
Supervisor: Mandeel Ali Raheem
Speaker Adaptation Based Text-to-Speech Synthesis
Speech is the most natural mode of communication. Speech synthesis converts text into human-like speech. One challenge in modeling this process is the lack of data resources, and speaker adaptation is one solution to it. With speaker adaptation, a model is first trained on a large amount of data and then adapted to a target speaker for whom only limited data is available. Speaker adaptation could also benefit speech communication for speech-impaired people. The student is asked to develop or modify a model that implements a speaker adaptation method. Basic programming knowledge is necessary, and machine learning / deep learning experience is beneficial. Suitable for BSc/MSc students.
Supervisor: Mandeel Ali Raheem