Exploring Efficient Neural Architectures for Text-to-Speech Synthesis
Text-to-Speech (TTS) synthesis is an interdisciplinary technology that draws on acoustics, signal processing, and machine learning. This project focuses on developing a deep-learning model for a high-quality TTS system. The student's task is to model the mapping from linguistic features to acoustic features using several deep neural network architectures. The student must then evaluate the updated system from several perspectives, including the intelligibility and naturalness of the synthetic speech and listener preference for it.
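As a rough illustration of the linguistic-to-acoustic mapping, the sketch below runs a small feed-forward network over frame-level feature vectors. It is only one possible starting point, not the architecture the project prescribes. The feature dimensions, layer sizes, and function names (`init_mlp`, `forward`, `mse`) are assumptions made for the example, and random arrays stand in for real linguistic and acoustic features.

```python
import numpy as np


def init_mlp(rng, sizes):
    """Create (weight, bias) pairs for a feed-forward network with the given layer sizes."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        # He-style initialisation, a common choice for ReLU layers.
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params


def forward(params, x):
    """Map linguistic frames (n_frames, n_in) to acoustic frames (n_frames, n_out)."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layers
    w, b = params[-1]
    return h @ w + b  # linear output layer, since this is a regression task


def mse(pred, target):
    """Mean squared error between predicted and reference acoustic frames."""
    return float(np.mean((pred - target) ** 2))


# Hypothetical dimensions: neither value comes from the project description.
LINGUISTIC_DIM = 300  # assumed size of one frame-level linguistic feature vector
ACOUSTIC_DIM = 80     # assumed size of one acoustic frame (e.g. mel-spectrogram bins)
N_FRAMES = 100

rng = np.random.default_rng(0)
params = init_mlp(rng, [LINGUISTIC_DIM, 256, 256, ACOUSTIC_DIM])

# Random placeholders standing in for real input features and reference targets.
x = rng.normal(size=(N_FRAMES, LINGUISTIC_DIM))
y = rng.normal(size=(N_FRAMES, ACOUSTIC_DIM))

pred = forward(params, x)
loss = mse(pred, y)
print(pred.shape, loss)
```

The network here is untrained, so the loss value is meaningless on its own. In practice the parameters would be fitted by gradient descent in a deep-learning framework, and the feed-forward layers could be swapped for recurrent, convolutional, or attention-based layers when comparing architectures.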