Quaero in short

Coordinator: Technicolor

Discover Quaero in one click!

Quaero (“I seek” in Latin) is a collaborative industrial research and innovation program addressing the automatic processing of multimedia and multilingual content. Coordinated by Technicolor, the program brings together 32 French and German public and private partners, each a leader in its field of research.

The program has a single organization built on three pillars:

  • eight application projects, carried out by industrial partners, which develop innovations targeting the needs of identified markets;
  • a shared research structure which facilitates technology transfer to the application projects: the Core Technology Cluster (CTC) and the associated Corpus project;
  • the principle of coopétition through comparative experimentation, which leads to the most promising solutions.

Evaluation campaigns and coopétition

Quaero developed a unique system for evaluating the technologies developed within the CTC project through annual evaluation campaigns. This comparative evaluation makes it possible to establish reliable measuring instruments in the field of automatic content processing. By stimulating competition among the partners (coopétition), it fosters the best solutions.

The comparative evaluation campaigns make it possible:

  • to have a third party objectively evaluate the methods and models developed within the CTC (Core Technology Cluster)
  • to find the most productive technological solution for the application projects
  • to intensify communication and the exchange of good practices between the partners
  • to develop measuring instruments and to reinforce the evaluation infrastructure in Europe.

Within this competitive framework, coopétition improves the efficiency of the project. It does not duplicate effort: on the contrary, coopétition strengthens the symbiosis and the exchange of best practices between the project's participants.


Evaluation in figures
  • 30 internal evaluation campaigns

The Quaero research centre: the Core Technology Cluster (CTC)

The Core Technology Cluster project gathers most of the upstream research activities of the academic and industrial laboratories. Together with the Corpus project, it forms the common core of research within the Quaero program.

The CTC plays three main roles:

  1. to advance the state of the art in technologies for indexing and structuring multimedia and multilingual documents;
  2. to develop key technologies for the eight application projects of Quaero;
  3. to measure the progress of the technologies during evaluation campaigns, in a spirit of coopétition.

The CTC focuses on eight research topics:

  • text processing,
  • translation,
  • speech processing,
  • audio and music processing,
  • video and image processing,
  • the multimodal structuring of audio-visual contents,
  • search and browsing techniques for multimedia content,
  • data protection.

The CTC in figures
  • 24 partners
  • 35 nationalities
  • 75 core technology modules developed and transferred to the application prototypes
  • More than 700 publications
  • 50 theses
  • 70 participations in national and international evaluation campaigns (ranking generally in the top 3)
  • 23 internal evaluation campaigns
  • 16 distinctions (best paper, young researcher prize, thesis prize, CNRS Crystal medal)

Raw material of Quaero: Corpus

Corpus is a cross-cutting project of the Quaero program which forms, with the CTC, the common core of upstream research and innovation. Corpus supplies Quaero's raw material: the collection and annotation of data produce the large corpora that automatic content processing requires.

Under the direction of the Technical University of Aachen (RWTH), the Corpus project aims:

  • to collect data from the application projects, that is, a “ground truth” coming from real conditions of use
  • to develop large corpora of multimedia and multilingual data (processing, annotation, validation)
  • to provide the CTC with real data allowing the development of key technologies in various fields (audio, translation, images, text, etc.)
  • to provide test data for Quaero's own experimentation campaigns

Corpus in figures
  • 26 partners
  • 1859 hours of transcribed audio from several sources in 9 languages
  • 4.6 billion annotated words
  • several hundred thousand images, including: 500,000 images by visual categories, 250,000 newspaper pages, 100 million images for research
  • 1556 hours of annotated videos
  • several corpora made available to the community, including:
    • an 800-hour video corpus with 346 concepts for the international TRECVID evaluation campaigns
    • a database of 518 visual concepts containing 500,000 images (approximately 1000 per concept)
    • 2 text corpora for the international BioNLP 2013 challenges (ontological acquisition and semantic annotation)
    • 2 corpora of 1.5 million words each, made available by ELRA: one drawn from spoken radio and broadcast data, the other from the written press of 1890.