Enter the Ontotext Portal
(access is restricted to the partners of the project)
The Ontotext Portal provides integrated access to the information extracted
automatically from the "Adige corpus" (Version A-500.000), which contains the
news stories published by the newspaper "L'Adige" from 1999 to 2006.
The corpus consists of almost 600,000 news stories, for a total of around 200 million
tokens and 12 million phrases.
The information extraction phase has been performed using technologies developed
within the Ontotext project.
Entity extraction through EntityPro has produced around 5.8, 3, and 3.1 million mentions
respectively for person, organization, and location entities.
This large number of mentions has been reduced through the co-reference module to
a final number of 630,591 persons, 278,244 organizations and 37,938 locations
that have been automatically added to the ontology.
Finally, 7,809 topics have been identified by OntoTDT and used to cluster 114,969
The Ontotext Portal offers ontology-driven content access, where search can be
performed according to standard modalities, i.e., by entering one or more words
in a specific field.
Access to information consists of two separate phases.
In the first phase (semantic retrieval), the user is presented with all the entities and topics that
satisfy the query.
In the second phase, the user selects one of the entities and is presented with four different views
of that entity:
- Articles: all the news stories in which the entity has been mentioned
(the user can choose to view them in order of date or importance)
- Citografo: shows the trend over time
(with flexible granularity) of the frequency with which the selected entity
is mentioned in the corpus
- Opinions: shows the trend over time
of the frequency with which opinions (positive or negative) are expressed
about the selected entity
- Record: The record provides extra information about the selected entity
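
As an illustration of the kind of aggregation behind a Citografo-style view, the following minimal sketch counts the mentions of one entity per time bucket at a selectable granularity. It is not the portal's implementation: the function name `mention_trend`, the granularity labels, and the sample dates are all hypothetical and chosen only for this example.

```python
from collections import Counter
from datetime import date

# Hypothetical mapping from a granularity label to a date format
# that identifies the time bucket.
BUCKET_FORMATS = {"year": "%Y", "month": "%Y-%m", "day": "%Y-%m-%d"}


def mention_trend(mention_dates, granularity="month"):
    """Count mentions per time bucket, returned in chronological order."""
    if granularity not in BUCKET_FORMATS:
        raise ValueError(f"unknown granularity: {granularity!r}")
    fmt = BUCKET_FORMATS[granularity]
    counts = Counter(d.strftime(fmt) for d in mention_dates)
    # Bucket keys are zero-padded, so sorting them as strings
    # is the same as sorting them chronologically.
    return dict(sorted(counts.items()))


# Invented publication dates of stories mentioning one entity.
sample = [date(1999, 1, 5), date(1999, 1, 20), date(1999, 3, 2), date(2000, 1, 7)]

print(mention_trend(sample, "year"))
print(mention_trend(sample, "month"))
```

Switching the `granularity` argument regroups the same mentions into coarser or finer buckets, which is one simple way to read "flexible granularity" in the description above.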
Last modified: Tue Aug 28 2007