On the Semiotics of Speech Signals
Speech signals are naturally regarded as the carrier of spoken language used for human-to-human and human-to-machine communication. In semiotics, however, the theoretical construct of a "sign" has a much wider scope and is not restricted to the relation between a linguistic concept and a sound pattern. Other signs are hidden in the speech signal and point beyond the linguistic content; they need not be created intentionally, and they need not arise from a human speaker. Exploiting this much wider notion of a sign for engineering applications requires the systematic description, detection, and interpretation of all signs contained in speech signals. To this end, we will distinguish four sign classes in the semiotics of speech signals:
The first class consists of language-related signs, spanning all linguistic aspects from phonetics to discourse, including the question of which language or regional variety is used for communication.
The second class comprises speaker-related signs such as gender, age, speech ability or disability, individual speaking tempo, general health state, physiological expressions of stress and emotions, personal identity, and the distinction between human and machine speakers.
The third class addresses the influence of the environment, i.e., ambiance-related signs. It includes the detection, localization, and tracking of a speaker in a room, environmental characteristics such as reverberation and background noise, and the insertion of comfort noise. Similarly, specific listening conditions arising from technical applications can imprint their own signs, such as acoustic echo or speech augmentation for playback in loud environments.
The fourth class allows inference about the technical processes used for the transmission or recording of speech signals. Examples are transmission bandwidth limitation or extension, compression with its specific losses and error concealment mechanisms, and in-band signalling and data hiding with digital and analog watermarks, e.g., as used for the unique identification of the speech acquisition path or device.
The lecture will illustrate a selection of these classes through simple experiments and explain some emerging engineering applications.
Gernot Kubin was born in Vienna, Austria, on June 24, 1960. He received his Dipl.-Ing. (1982) and Dr.techn. (1990, sub auspiciis praesidentis) degrees in Electrical Engineering from TU Vienna. He has been Professor of Nonlinear Signal Processing and head of the Signal Processing and Speech Communication Laboratory (SPSC), the Broadband Communications Laboratory, and the Computer Engineering Laboratory at TU Graz, Austria, since 2000, 2004, and 2011, respectively. He served as Dean of Studies in EE-Audio Engineering from 2004 to 2007 and as Chair of the Senate from 2007 to 2010, and he has coordinated the Doctoral School in Information and Communications Engineering since 2007.

Earlier international appointments include CERN Geneva, Switzerland (1980); TU Vienna (1983-2000); Erwin Schroedinger Fellow at Philips Natuurkundig Laboratorium Eindhoven, The Netherlands (1985); AT&T Bell Labs Murray Hill, USA (1992-1993 and 1995); KTH Stockholm, Sweden (1998); Global IP Sound, Sweden & USA (2000-2001 and 2006); UC San Diego & UC Berkeley, USA (2006); and UT Danang, Vietnam (2009). In 2011, he co-founded Synvo GmbH, a joint start-up of ETH Zurich and TU Graz in the area of speech synthesis for mobile devices. He holds leading positions in several national research centres for academia-industry collaboration, including the Vienna Telecommunications Research Centre FTW, 1999-present (Key Researcher and Board of Governors); the Christian Doppler Laboratory for Nonlinear Signal Processing, 2002-2010 (Founding Director); the Competence Network for Advanced Speech Technologies COAST, 2006-present (Scientific Director); the COMET Excellence Project Advanced Audio Processing AAP, 2008-present (Key Researcher); and the National Research Network on Signal and Information Processing in Science and Engineering SISE, 2008-2011 (Principal Investigator), funded by the Austrian Science Fund. Dr. Kubin is a member of the Board of the Austrian Acoustics Association and of the IEEE Speech and Language Processing Technical Committee.
His research interests are in nonlinear signals and systems, digital communications, computational intelligence, and speech communication. He has authored or co-authored more than 140 peer-reviewed publications and holds ten patents.