Briefings in Bioinformatics Advance Access published online on December 6, 2008
Briefings in Bioinformatics, doi:10.1093/bib/bbn044
Linked data and provenance in biological data webs
Corresponding author: Jun Zhao, Image Bioinformatics Research Group, Department of Zoology, University of Oxford, The Tinbergen Building, South Parks Road, Oxford, OX1 3PS, UK. Tel: 0044-(0)1865 281 094; Fax: 0044-(0)1865 271 211; E-mail: jun.zhao@zoo.ox.ac.uk
The Web is now used as a platform for publishing and linking life science data, and its linking architecture can be exploited to join heterogeneous data from multiple sources. However, because data are frequently updated in a decentralized environment, provenance information becomes critical to providing reliable and trustworthy services to scientists. This article presents design patterns for representing and querying provenance information about the mapping links between heterogeneous data from sources in the domain of functional genomics. We illustrate the use of named Resource Description Framework (RDF) graphs at different levels of granularity to make provenance assertions about linked data, and demonstrate that these assertions are sufficient to support requirements including data currency, integrity, evidential support and historical queries.
Keywords: linked data, provenance, trust, named graphs, semantic web
Submitted: July 31, 2008. Received (in revised form): September 24, 2008.
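As a rough illustration of the named-graph idea summarized in the abstract, the following minimal Python sketch stores a mapping link in one named graph and then makes provenance assertions whose subject is that graph's name. It is not the article's implementation: every identifier (the `ex:` names, the predicates, the date and the helper `provenance_of`) is hypothetical and chosen only for illustration, and plain tuples stand in for a real RDF store.

```python
from datetime import date

# Hypothetical graph names, invented for this sketch.
LINKS_GRAPH = "ex:graph/mapping-links"
PROV_GRAPH = "ex:graph/provenance"

# Each quad is (subject, predicate, object, graph_name).
quads = [
    # A mapping link between records from two sources, held in a named graph.
    ("ex:geneA", "ex:mapsTo", "ex:proteinB", LINKS_GRAPH),
    # Provenance assertions: the subject is the named graph itself.
    (LINKS_GRAPH, "ex:createdOn", date(2000, 1, 1), PROV_GRAPH),
    (LINKS_GRAPH, "ex:derivedFrom", "ex:sourceRelease1", PROV_GRAPH),
]


def provenance_of(subject, predicate, obj):
    """Return the provenance statements about each graph containing the triple."""
    graphs = {g for s, p, o, g in quads if (s, p, o) == (subject, predicate, obj)}
    return {g: {p: o for s, p, o, _ in quads if s == g} for g in graphs}


result = provenance_of("ex:geneA", "ex:mapsTo", "ex:proteinB")
for graph, statements in result.items():
    print(graph, statements)
```

Because the provenance is attached to the graph rather than to each triple, the granularity can be varied by choosing how many links are grouped under one graph name, from a whole data release down to a single link.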