A pipeline framework for Natural Language Processing
The target audience is projects that need standard Natural
Language Processing tools for production or research purposes.
Alvis NLP is a pipeline framework to annotate text documents using
Natural Language Processing (NLP) tools for sentence and word
segmentation, named-entity recognition, term analysis, semantic typing
and relation extraction (see the paper by Nedellec et al. in Handbook on
Ontologies 2009 for a comprehensive overview).
The available functions are exposed as modules that can be composed in a sequence to form the pipeline. This sequence, along with the parameters of each module, is specified in an XML-based configuration file.
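The idea of modules composed in a sequence can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the names `Document`, `Module`, `SentenceSplitter`, `WordSplitter` and `run` are hypothetical and are not taken from the Alvis NLP API, and the sequence is built in code here, whereas Alvis NLP reads it from its XML-based configuration file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; this is not the Alvis NLP API.
class PipelineSketch {

    /** A document: raw text plus named annotation layers. */
    static class Document {
        final String text;
        final Map<String, List<String>> layers = new LinkedHashMap<String, List<String>>();
        Document(String text) { this.text = text; }
    }

    /** One processing step of the pipeline. */
    interface Module {
        void process(Document doc);
    }

    /** Splits the text into sentences on a delimiter regex (the module's parameter). */
    static class SentenceSplitter implements Module {
        private final String delimiterRegex;
        SentenceSplitter(String delimiterRegex) { this.delimiterRegex = delimiterRegex; }
        public void process(Document doc) {
            List<String> sentences = new ArrayList<String>();
            for (String s : doc.text.split(delimiterRegex)) {
                if (!s.trim().isEmpty()) sentences.add(s.trim());
            }
            doc.layers.put("sentences", sentences);
        }
    }

    /** Splits each sentence into words; relies on the "sentences" layer. */
    static class WordSplitter implements Module {
        public void process(Document doc) {
            List<String> words = new ArrayList<String>();
            for (String sentence : doc.layers.get("sentences")) {
                words.addAll(Arrays.asList(sentence.split("\\s+")));
            }
            doc.layers.put("words", words);
        }
    }

    /** Runs the modules in order; each one sees the annotations left by the previous ones. */
    static Document run(List<Module> pipeline, String text) {
        Document doc = new Document(text);
        for (Module m : pipeline) {
            m.process(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        List<Module> pipeline = new ArrayList<Module>();
        pipeline.add(new SentenceSplitter("[.!?]"));
        pipeline.add(new WordSplitter());
        Document doc = run(pipeline, "Alvis annotates text. Modules run in order.");
        System.out.println(doc.layers);
    }
}
```

The order of the list matters: `WordSplitter` reads the layer written by `SentenceSplitter`, which is the kind of dependency a pipeline sequence has to respect.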
New components can easily be integrated into the pipeline. To implement a
new module, one writes a Java class that manipulates text annotations
according to the data model defined in Alvis NLP.
Alvis NLP loads the class at run time, which simplifies integration.
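Run-time loading of a module class can be illustrated with standard Java reflection. The sketch below is an assumption about how such loading might work in general, not a description of Alvis NLP internals: `TextModule`, `CapitalizedTokenModule` and `load` are hypothetical names, and the real data model is richer than a list of strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the Alvis NLP API.
class ModuleLoaderSketch {

    /** Hypothetical contract that a module class would implement. */
    interface TextModule {
        List<String> annotate(String text);
    }

    /** Example module: emits one annotation per capitalized token. */
    static class CapitalizedTokenModule implements TextModule {
        public CapitalizedTokenModule() {}
        public List<String> annotate(String text) {
            List<String> found = new ArrayList<String>();
            for (String token : text.split("\\s+")) {
                if (!token.isEmpty() && Character.isUpperCase(token.charAt(0))) {
                    found.add(token);
                }
            }
            return found;
        }
    }

    /** Instantiates a module from its class name, e.g. a name read from a configuration file. */
    static TextModule load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            // asSubclass rejects classes that do not implement the contract.
            return cls.asSubclass(TextModule.class).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load module " + className, e);
        }
    }

    public static void main(String[] args) {
        // The class is known here only by name, as it would be in a configuration file.
        String name = CapitalizedTokenModule.class.getName();
        TextModule module = load(name);
        System.out.println(module.annotate("Alvis NLP loads Java classes"));
    }
}
```

Because the framework side depends only on the interface and a class name, a new module can be added without recompiling the code that runs the pipeline.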
Requires Java 7 and Weka.
Sources are available upon request. Free to use for academic institutions.