Resources & Tools
Within the Ontotext project we have developed the following resources and tools:
A module for the automatic recognition of named entities, based on machine learning techniques (Support Vector Machines). It is available within the TextPro platform for the linguistic analysis of texts.
Co-reference resolution module.
An algorithm to decide with a certain degree of confidence whether two mentions refer to the same entity or not.
A rule-based system for the automatic recognition and normalization of temporal expressions.
Italian Content Annotation Bank (I-CAB).
A corpus of Italian news stories manually annotated with temporal expressions, entities (e.g. persons and organizations), and relations between entities.
A lexical resource in which each WordNet synset is associated with three numerical scores describing, respectively, how objective, positive, and negative the terms contained in the synset are.
An unsupervised, language-independent module for detecting the topics dealt with in a collection of texts.
To build a large knowledge base from linguistic and web resources, we have developed a series of domain ontologies in different fields, such as sport, broadcasters, local administration, arts and professions.
Last modified: Tue Aug 28 2007