Explainable Deep Learning Models for Text-to-Speech Conversational AI
Conversational AI uses machine learning to build speech-based applications that let people interact naturally with devices, machines, and computers through audio. A conversational AI application is built by connecting several deep learning models in a pipeline. This project aims to study and refine the text-to-speech (TTS) component of one conversational AI toolkit (for example, NVIDIA NeMo or SpeechBrain).
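To make the pipeline idea concrete, here is a minimal sketch of how separate stages could be chained so that the TTS stage can be inspected or swapped on its own. The stage split (speech recognition, dialogue, TTS) is an assumption about a typical setup, not something this description specifies. The names `Stage`, `run_pipeline`, `fake_asr`, `fake_dialogue`, and `fake_tts` are all hypothetical placeholders; none of them come from NeMo or SpeechBrain, and a real project would call the toolkit's own models in their place.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Stage:
    """One step of the pipeline: a name plus a function from input to output."""
    name: str
    run: Callable[[Any], Any]


def run_pipeline(stages: List[Stage], payload: Any) -> Tuple[Any, List[Tuple[str, Any]]]:
    """Feed the payload through each stage in order.

    Returns the final output and a trace of every intermediate result,
    which is useful when studying what a single stage (e.g. TTS) received.
    """
    trace = []
    for stage in stages:
        payload = stage.run(payload)
        trace.append((stage.name, payload))
    return payload, trace


# Hypothetical stand-ins for real models (assumptions for illustration only).
def fake_asr(audio: bytes) -> str:
    # Pretend the audio bytes are already the transcript.
    return audio.decode("utf-8")


def fake_dialogue(text: str) -> str:
    # Trivial "dialogue model": echo the user's words back.
    return f"You said: {text}"


def fake_tts(text: str) -> List[float]:
    # Placeholder synthesizer: one silent sample per input character.
    return [0.0] * len(text)


PIPELINE = [
    Stage("asr", fake_asr),
    Stage("dialogue", fake_dialogue),
    Stage("tts", fake_tts),
]

waveform, trace = run_pipeline(PIPELINE, b"hello")
```

Because each stage is only a named callable, the TTS placeholder could be replaced by a toolkit model without touching the other stages, and the trace records the exact text that reached it.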