DEPARTMENT OF TELECOMMUNICATIONS AND MEDIA INFORMATICS
Budapest University of Technology and Economics - Faculty of Electrical Engineering and Informatics

List of topics

Conversational AI applications
The goal is to build a real robot (virtual agent) that converses by voice, using the NVIDIA NeMo/RIVA toolkits. The project could be the first of its kind to be realized in Hungarian. Python programming skills and a basic knowledge of deep learning are an advantage.
Supervisor: Dr. Mihajlik Péter
Speaker Adaptation Based Deep Neural Network Text-to-Speech Synthesis
Speech processing has attracted the interest of both researchers and industry over the last few decades. Speech synthesis is the technique of converting text into artificial speech. It can be used in speech-based assistive systems for blind users, web browsers, mobile phones, PCs, and laptops. Current work focuses on making synthesized speech sound as natural as possible. This project aims to create a speaker adaptation model that uses a deep neural network to synthesize speech. The project will be carried out using Merlin, a neural-network-based speech synthesis toolkit.
Supervisor: Mandeel Ali Raheem
Voice Conversion Technology and its Application with Emotional Speech
Speech is the most widely used and natural way for people to communicate. The goal of a voice conversion (VC) system is to find a transformation that makes the source speaker's speech sound as if the target speaker had uttered it. This project aims to present a rule-based voice conversion system that converts neutral speech into emotional speech (e.g., angry, fearful, happy, sad, or surprised).
Speaker Adaptation Based Text-to-Speech Synthesis
Speech is the most natural mode of communication. Speech synthesis converts text into human-like speech. One challenge in modeling this process is the scarcity of data. Speaker adaptation is one solution: a model is trained on a large dataset and then adapted to a target speaker for whom only limited data is available. Speaker adaptation can also benefit speech communication for people with speech impairments. The student is asked to develop or modify a model that implements a speaker adaptation method. Basic programming knowledge is required; machine learning / deep learning experience is beneficial. For BSc/MSc students.
Supervisor: Mandeel Ali Raheem
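
To give prospective students a feel for the speaker adaptation idea described in the topic above, here is a minimal toy sketch. It is not the project's method and does not use Merlin or a neural network: a one-dimensional linear model stands in for the acoustic model, and all data are synthetic. The names `train`, `mse`, and `make_data`, as well as the numeric values, are assumptions chosen purely for illustration. The sketch pretrains on a large "source" set, then continues training briefly on a handful of "target speaker" samples.

```python
import random

def mse(params, data):
    """Mean squared error of the linear model y = w*x + b on (x, y) pairs."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(params, data, lr, steps):
    """Batch gradient descent, starting from the given (w, b)."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def make_data(w, b, n):
    """Synthetic noisy samples from y = w*x + b (stand-in for speech data)."""
    data = []
    for _ in range(n):
        x = random.uniform(-1, 1)
        data.append((x, w * x + b + random.gauss(0, 0.05)))
    return data

random.seed(0)

# Hypothetical setup: plenty of source data, very little target-speaker data.
source_data = make_data(2.0, 0.5, 500)
target_train = make_data(2.3, 0.8, 10)
target_test = make_data(2.3, 0.8, 100)

# Step 1: train a base model on the large source set.
base = train((0.0, 0.0), source_data, lr=0.1, steps=500)

# Step 2: adapt by continuing training from the base weights
# on the few target-speaker samples.
adapted = train(base, target_train, lr=0.1, steps=50)

error_before = mse(base, target_test)
error_after = mse(adapted, target_test)
print(f"target error before adaptation: {error_before:.4f}")
print(f"target error after adaptation:  {error_after:.4f}")
```

In a real system the same two-step pattern would typically be applied to a deep acoustic model, where starting from the pretrained weights matters far more than in this toy case, because ten samples alone could not train such a model from scratch.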