Download the Technology Catalog in pdf here.
Companies who want to integrate the transcription of human speech into their products.
Speech-to-Text technology is key to indexing the multimedia content found in multimedia databases or in video and audio collections on the World Wide Web, and to making it searchable by human queries. In addition, it offers a natural interface for submitting and executing queries.
This technology is also a component of speech-translation services. Combined with machine translation technology, it makes it possible to design machines that take human speech as input and translate it into another language. This can be used to enable human-to-human communication across the language barrier or to access content in other languages in a cross-lingual way.
All content providers may be interested in the automatic characterization of the in-home audience. Providers that aim to personalize video (either Video on Demand (VoD) or broadcast) or ads will want this module to help automate the personalization of the content they provide.
The audience characterization module may also be used by end users to manage their own content at home. Furthermore, content providers and advertisers will be interested in the
‘level of attention’ information provided by the module.
Provided content personalization
Knowing what the audience is:
Question-Answering is an end-user application and the easiest way for anyone to find information: ask the question however you like and obtain answers, not snippets or pages.
Search and find precise answers in any collection of texts, from the Web or any other source (voice recognition, optical character recognition, etc.), with optional correction of the source text, the ability to generate questions from generic requests (possibly a single word), the ability to find similar questions and their answers, etc.
Monolingual and multilingual Question-Answering system. Languages: English, French (+ Spanish, Portuguese, Polish, with partners using the same API).
Domain-specific communities, especially technical and scientific, willing
to build search engines and information systems to manage documents with
fine-grained semantic annotations.
Search engines and information systems development.
Bank, Insurance, Administration, Telecom and Utility Companies, Historical Document Conversion
Everyone who has to deal with forms containing handwritten fields or to process incoming mail
The targeted users and customers of language recognition technologies
are actors in the multimedia and call center sectors, including academic
and industrial organizations, as well as actors in the defense domain,
interested in the processing of audio documents, particularly when
the collection of documents contains multiple languages.
A language identification system can be run prior to a speech recognizer. Its output is then used to load the appropriate language-dependent speech recognition models for the audio document.
Alternatively, language identification might be used to dispatch
audio documents or telephone calls to human operators fluent in the
identified language.
Other potential applications involve the use of language identification (LID) as a front-end to a multilingual translation system. This technology can also be part of an automatic system for spoken data retrieval or automatic enriched transcription.
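The front-end role of language identification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `LidResult`, `route_audio`, the `identify` callable, the `recognizers` table, and the confidence threshold are all hypothetical names, not part of any catalog API. The identifier runs first, and its output selects which language-dependent recognizer processes the audio.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical type: a recognizer maps raw audio bytes to a transcript.
Recognizer = Callable[[bytes], str]


@dataclass
class LidResult:
    """Hypothetical output of a language identification system."""
    language: str      # language code such as "en" or "fr"
    confidence: float  # score in [0, 1]


def route_audio(
    audio: bytes,
    identify: Callable[[bytes], LidResult],
    recognizers: Dict[str, Recognizer],
    fallback: str = "en",
    min_confidence: float = 0.5,
) -> Tuple[str, str]:
    """Run LID first, then hand the audio to the matching recognizer.

    Falls back to `fallback` when the identified language is unsupported
    or the LID confidence is below `min_confidence` (an assumed policy).
    `fallback` must itself be a key of `recognizers`.
    """
    result = identify(audio)
    language = result.language
    if result.confidence < min_confidence or language not in recognizers:
        language = fallback
    return language, recognizers[language](audio)


if __name__ == "__main__":
    # Stand-in components for demonstration only.
    def fake_lid(audio: bytes) -> LidResult:
        return LidResult(language="fr", confidence=0.9)

    recognizers = {
        "en": lambda audio: "<English transcript>",
        "fr": lambda audio: "<French transcript>",
    }
    print(route_audio(b"\x00\x01", fake_lid, recognizers))
```

The same routing step could dispatch a call to an operator queue instead of a recognizer by replacing the values in `recognizers` with queue handlers.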
The targeted users and customers are the multimedia industry actors, and any content and service provider with speech data.
The targeted users and customers of speech-to-text transcription
technologies are actors in the multimedia and call center sectors,
including academic and industrial organizations interested in the
automatic mining and processing of audio or audiovisual documents.
This core technology can serve as the basis for a variety of applications: multilingual audio indexing, teleconference transcription, telephone speech analytics, transcription of speeches, subtitling…
Large vocabulary continuous speech recognition is the key technology for
enabling content-based information access in audio and audiovisual
documents. Most of the linguistic information is encoded in the audio channel of
audiovisual data, which once transcribed can be accessed using
Via speech recognition, spoken document retrieval can support random
access using specific criteria to relevant portions of audio documents,
reducing the time needed to identify recordings in large multimedia
databases. Some applications are data-mining, news-on-demand, and
Question-answering is an end-user application. The purpose is to go beyond the traditional way of retrieving information through search engines. Our system is interactive, with both a speech (phone or microphone) and text (web) interface.
QA systems can be viewed as a direct extension of search engines. They allow users to ask questions in natural language.