Automatic speech recognition, also known as speech-to-text, is the
transcription of speech into (machine-readable) text by a computer.
Automatic speech recognition has too many uses to list exhaustively here. The main uses today are customer interaction via the telephone,
healthcare dictation, and use in car navigation systems and on
smartphones. As the technology improves, these applications will extend to
audio mining, speech translation, and increased human-computer
interaction via speech.
Automatic speech recognition is a very hard problem in computer science, but the technology is more mature than machine translation.
After a wave of media hype at the end of the 1990s, the technology has
continuously improved and it has been adopted by the market, e.g. in
large deployments in the customer contact sector, in the automation of
radiology dictation, or in voice-enabled navigation systems.
Public awareness has increased through use on smartphones, in particular Siri. The research community concentrates on problems such as the recognition
of spontaneous speech or the easy acquisition of new languages.
Speech translation is a computationally and memory-intensive process, so
the typical setup is to have one or several computers on the internet
serving the speech translation requirements of many users.
RWTH provides an open-source speech recognizer free of charge for academic use.
Other usage should be subject to a bilateral agreement.