The Document Components Ontology (DoCO)

Tracking #: 1016-2227

Alexandru Constantin
Silvio Peroni
Steve Pettifer
David Shotton
Fabio Vitali

Responsible editor: 
Oscar Corcho

Submission type: 
Ontology Description
Machine-readable descriptions of the structure of documents, and of the document discourse (e.g. the scientific discourse within scholarly articles), are crucial for facilitating semantic publishing and the overall comprehension of documents by both users and machines. In this paper we introduce DoCO, the Document Components Ontology, an OWL 2 DL ontology that provides a general-purpose structured vocabulary of document elements to describe both structural and rhetorical document components in RDF. In addition to presenting the formal description of the ontology, this paper showcases its practical utility in a variety of our own applications and in other activities of the Semantic Publishing community that rely on DoCO to annotate and retrieve document components of scholarly articles.