Raw material of Quaero: Corpus
Corpus is a horizontal project of the Quaero program which, together with the CTC, forms the common core of upstream research and innovation. Corpus supplies Quaero with its raw material: collecting and annotating data yields the large corpora that are necessary for the automatic processing of content.
Under the direction of the Technical University of Aachen (RWTH), the Corpus project aims to:
- collect data from the application projects, which serves as a “ground truth” drawn from real conditions of use
- develop large corpora of multimedia and multilingual data (processing, annotation, validation)
- provide the CTC with real data for developing key technologies in various fields (audio, translation, images, text, etc.)
- provide test data for Quaero's own experimentation campaigns