Download the Technology Catalog in pdf here.
Companies who want to integrate the transcription of human speech into their products.
Speech-to-Text technology is key to indexing multimedia content as it is
found in multimedia databases or in video and audio collections on the
World Wide Web, and to make it searchable by human queries. In addition, it offers a natural interface for submitting and executing queries.
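As a minimal sketch of how transcripts can make audio searchable, the example below builds an inverted index from time-stamped words. The transcript layout (a list of word and start-time pairs per document) and the function names `build_index` and `search` are assumptions made for illustration; they do not describe the interface of any particular speech-to-text product.

```python
from collections import defaultdict


def build_index(transcripts):
    """Map each lowercased word to (document id, start time in seconds) pairs."""
    index = defaultdict(list)
    for doc_id, words in transcripts.items():
        for word, start in words:
            index[word.lower()].append((doc_id, start))
    return index


def search(index, query):
    """Return every position where a single query word was spoken."""
    return index.get(query.lower(), [])


# Hypothetical recognizer output: (word, start time in seconds) per document.
transcripts = {
    "news_01": [("Election", 0.0), ("results", 0.6), ("tonight", 1.1)],
    "talk_07": [("the", 0.0), ("election", 0.3), ("campaign", 0.9)],
}

index = build_index(transcripts)
hits = search(index, "election")
print(hits)
```

Each hit carries a document identifier and a time offset, so a player could jump directly to the moment the word was spoken rather than to the start of the recording.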
This technology is also a component of speech-translation services. In combination with machine translation technology, it is possible to
design machines that take human speech as input and translate it into a
new language. This can be used to enable human-to-human communication across the language barrier or to access content in other languages in a cross-lingual way.
The primary use of TyDI is the design of termino-ontologies for indexing textual documents. It can therefore be of great help to most projects involving natural language processing.
End-user application: Question-Answering is the easiest way for anyone to find information. Ask the question however you like and obtain answers, not snippets or pages.
Search and find precise answers in any collection of texts, from the Web
or any other source (voice recognition, optical character recognition,
etc.), with optional correction of the source text, the ability to generate
questions from generic requests (possibly a single word), the ability to
find similar questions and their answers, etc.
Monolingual and multilingual Question-Answering system. Languages: English, French (+ Spanish, Portuguese, Polish, with partners using the same API).
The targeted users and customers are all Internet actors that provide
personalized services to their users and are interested in integrating
recommender systems that are more respectful of those users' privacy.
Music items are generally classified primarily by music genre: electronica, jazz, pop/rock… However, the editorial metadata related
to the genre is generally only accessible at the artist level (the
whole set of music tracks produced by one artist will belong to the same
music genre whatever the content of the tracks). Ircammusicgenre is a software tool that automatically estimates the music genres to which a music track belongs. The list of music genres considered by the software can be
pre-determined by Ircam (electronica, jazz, pop/rock…) or can be adapted
to categories relevant to the partner, provided a sufficient number of
sound examples per category is available.
Ircammusicgenre can also perform multi-labeling of a music track,
i.e. assigning a set of genre labels instead of a single genre. In this case, a weighting is assigned to each estimated label.
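To illustrate what a weighted multi-label result can look like, here is a minimal sketch that turns per-genre scores into normalized weights and keeps the labels above a threshold. The scores, the threshold value and the function name `multi_label` are assumptions for illustration only; the source does not describe how Ircammusicgenre computes or reports its weights.

```python
def multi_label(scores, threshold=0.2):
    """Return genre labels with normalized weights, strongest first.

    scores: dict mapping a genre name to a non-negative classifier score.
    threshold: minimum normalized weight for a label to be kept (assumed value).
    """
    total = sum(scores.values())
    if total <= 0:
        return {}
    # Normalize so that the weights of all candidate genres sum to 1.
    weights = {genre: score / total for genre, score in scores.items()}
    kept = {genre: w for genre, w in weights.items() if w >= threshold}
    # Sort by decreasing weight; dicts preserve insertion order.
    return dict(sorted(kept.items(), key=lambda item: item[1], reverse=True))


# Hypothetical scores for one track.
scores = {"electronica": 6.0, "jazz": 1.0, "pop/rock": 3.0}
labels = multi_label(scores)
print(labels)
```

With these example scores the track keeps two labels, electronica and pop/rock, while jazz falls below the threshold and is dropped.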
Ircammusicmood is a software tool that automatically estimates the music mood of a music track. Music mood refers to the “mood” that a track suggests: positive, sad, powerful, calming…
As for the music genre, the list can be predetermined by Ircam or discussed with the partner. Multi-labels can also be applied to the music mood classification.
The targeted users and customers of speech-to-text transcription
technologies are actors in the multimedia and call center sectors,
including academic and industrial organizations interested in the
automatic processing and mining of audio or audiovisual documents.
This core technology can serve as the basis for a variety of applications: multilingual audio indexing, teleconference transcription, telephone speech analytics, transcription of speeches, subtitling…
Large vocabulary continuous speech recognition is the key technology for
enabling content-based information access in audio and audiovisual
documents. Most of the linguistic information is encoded in the audio channel of
audiovisual data, which once transcribed can be accessed using
Via speech recognition, spoken document retrieval can support random
access using specific criteria to relevant portions of audio documents,
reducing the time needed to identify recordings in large multimedia
databases. Some applications are data-mining, news-on-demand, and
Ircamaudiosim supports the development of music recommendation based on music content similarity. It can therefore be used in any system (online or offline) that requires music recommendation, for example a recommendation engine for an online music service or a browser for an offline music collection.
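The idea of recommending by content similarity can be sketched as a nearest-neighbor search over per-track feature vectors. The example below uses cosine similarity on made-up three-dimensional vectors; the feature values, the track names and the functions `cosine` and `recommend` are hypothetical and say nothing about the features or the similarity measure that Ircamaudiosim actually uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(seed, features, k=2):
    """Return the k tracks most similar to the seed track, best first."""
    scored = [
        (cosine(features[seed], vector), track)
        for track, vector in features.items()
        if track != seed
    ]
    scored.sort(reverse=True)
    return [track for _, track in scored[:k]]


# Hypothetical content descriptors extracted from the audio of each track.
features = {
    "track_a": [1.0, 0.0, 0.0],
    "track_b": [0.9, 0.1, 0.0],
    "track_c": [0.0, 1.0, 0.0],
    "track_d": [0.5, 0.5, 0.0],
}

print(recommend("track_a", features))
```

Because the ranking depends only on the audio-derived vectors, a sketch like this needs no listening history, which is what distinguishes content-based recommendation from collaborative filtering.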
Bank, Insurance, Administration, Telecom and Utility Companies, Historical Document Conversion
All content providers may be interested in the automatic characterization of the in-home audience. Providers that aim to personalize video (either Video on Demand (VoD) or broadcast) or ads can use this module to help automate the personalization of the content they provide.
The audience characterization module may also be used by end users to manage their own content at home. Furthermore, content providers and advertisers will be interested in the
‘level of attention’ information provided by the module.
Provided content personalization
Knowing what the audience is:
Personal/professional managers of video collections