Lecture: Advanced Speech Processing, Winter Term 2025/2026

  • Instructor: Prof. Dr. Emanuël Habets
  • Teaching Assistant: TBD
  • Time: Winter Term 2025/2026, Tuesdays 14:15-15:45
  • Place: Am Wolfsmantel 33, Erlangen-Tennenlohe, Room 3R4.04
  • Format: Lecture
  • Credits: 2.5 ECTS
  • Exam (graded): Oral examination at the end of the term

News

The first lecture will be held on October 28th at 14:15.

Format

The lecture has the following format:

  • Every meeting lasts 90 minutes.

For further information, please contact Prof. Dr. Emanuël Habets.

Content

Speech is at the core of human communication and increasingly central to our interaction with technology. From voice assistants and teleconferencing to hearing aids, security applications, and immersive media, speech technologies must perform robustly in real-world acoustic environments. These environments are often far from ideal: noise, reverberation, and interfering sources can severely degrade the quality and intelligibility of speech signals. At the same time, advances in machine learning and signal processing have opened new opportunities for creating, modifying, and analyzing speech in powerful ways.

This lecture provides a comprehensive introduction to advanced speech processing, covering both classical and modern neural approaches. Topics include:

  • Speech quality and intelligibility assessment: objective and subjective methods for evaluating speech processing algorithms.
  • Speech enhancement: noise reduction and dereverberation with classical signal processing and deep learning.
  • Speech extraction and separation: isolating target speakers or signals from complex mixtures.
  • Beamforming: spatial filtering to enhance speech captured by microphone arrays using classical array processing and deep learning.
  • Text-to-speech synthesis (TTS): generating natural and expressive speech from text with modern neural architectures.
  • Voice anonymization: transforming speech signals to protect privacy while preserving intelligibility.
  • Speaker identification and verification: modeling and recognizing speaker characteristics for personalization and security.

The lecture combines theoretical foundations, algorithmic insights, and practical demonstrations. Students will gain an understanding of both classical methods and cutting-edge neural approaches, and their application in real-world scenarios.

Target audience: This lecture is designed for graduate students and researchers interested in speech and audio technology. By the end of the lecture, participants will have a strong foundation to understand, design, and critically evaluate methods in advanced speech processing.

Course Material

The lecture slides can be downloaded from StudOn.

Links

Further audio-related courses offered by the AudioLabs can be found at: