The Ontotext Portal provides integrated access to the information automatically extracted from the "Adige corpus" (Version A-500.000), which contains the news stories published by the newspaper "L'Adige" from 1999 to 2006. The corpus consists of almost 600,000 news stories, for a total of around 200 million tokens and 12 million phrases.

The information extraction phase was performed using technologies developed within the Ontotext project. Entity extraction with EntityPro produced around 5.8, 3, and 3.1 million mentions for person, organization, and location entities, respectively. The co-reference module reduced this large number of mentions to 630,591 persons, 278,244 organizations, and 37,938 locations, which were automatically added to the ontology. Finally, OntoTDT identified 7,809 topics, which were used to cluster 114,969 news stories.
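As a rough illustration of how much the co-reference step condenses the data, the sketch below divides the approximate mention counts by the final entity counts quoted above. The mention figures are rounded in the text ("around"), so the resulting ratios are only indicative; the variable and function names are hypothetical and are not part of the Ontotext tools.

```python
# Approximate mention counts (rounded figures from the text) and the
# final entity counts produced by the co-reference module.
MENTIONS = {"person": 5_800_000, "organization": 3_000_000, "location": 3_100_000}
ENTITIES = {"person": 630_591, "organization": 278_244, "location": 37_938}


def mentions_per_entity(mentions: dict, entities: dict) -> dict:
    """Return the average number of mentions merged into each entity, per type."""
    return {kind: mentions[kind] / entities[kind] for kind in mentions}


ratios = mentions_per_entity(MENTIONS, ENTITIES)
for kind, ratio in ratios.items():
    print(f"{kind}: about {ratio:.1f} mentions per entity")
```

Under these figures, persons and organizations average roughly ten mentions per entity, while locations are mentioned far more often relative to the number of distinct entities.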

The Ontotext Portal offers ontology-driven content access. Searches are performed in the standard way, i.e. by entering one or more words in a search field.

Access to information takes place in two separate phases.
In the first phase (semantic retrieval), the user is presented with all the entities and topics that match the query.
In the second phase, the user selects one of the entities and is presented with four different views of that entity:

