Reproducing research

The ability to verify results by re-doing experiments (observational, physical or computational) is fundamental to science. This page discusses how LABDRIVE supports this.

A discussion of the reproducibility of research, referencing an NSF study, makes the point that one must distinguish between reproducibility, which is the ability to duplicate results using the same raw data (and procedures), and replicability, which is the ability of a study to duplicate results with newly collected data. Other, finer-grained definitions have also been proposed, particularly for medical and social science studies.

Use of Provenance in Reproducibility

As noted below, the Provenance Information of a Data Object is extremely important because it provides details of how that Data Object was created, including which inputs, processes and parameters were used.
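To make the idea concrete, the following is a minimal sketch of a record that captures the inputs, process and parameters behind a Data Object. It is an illustration only: the `ProvenanceRecord` class, its field names, the file name and the process label are all hypothetical, and it does not represent LABDRIVE's own provenance schema or API.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class ProvenanceRecord:
    """Hypothetical record; field names are illustrative, not a LABDRIVE schema."""
    process: str                 # name and version of the procedure that was run
    parameters: dict             # parameter values used for the run
    input_digests: dict = field(default_factory=dict)  # input name -> SHA-256

    def add_input(self, name: str, content: bytes) -> None:
        # Store a fixity digest so the exact input can be identified later.
        self.input_digests[name] = sha256_hex(content)

    def to_json(self) -> str:
        # Sorted keys give a stable text form that can be stored with the data.
        return json.dumps(asdict(self), sort_keys=True, indent=2)


# Example values below are invented for illustration.
record = ProvenanceRecord(process="calibrate.py v1.2", parameters={"threshold": 0.5})
record.add_input("raw_run_001.csv", b"t,value\n0,1.0\n1,1.5\n")
print(record.to_json())
```

Recording a digest for each input, rather than only its name, is one way to tie the provenance to the exact bytes that were used.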

LABDRIVE support for Reproducibility

LABDRIVE supports reproducibility, defined as the ability to duplicate results using the same raw data and procedures, by preserving:

  • the raw data - preserved as Data Objects, requiring little or no Representation Information, but adequate Provenance etc.

  • the procedures performed - preserved as text or in scripting languages, with additional information in the Provenance. Note that some of the Provenance Information may be in the header of the data files, e.g. in FITS files.

  • the software used - preserved as described in Software as part of the RIN.
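Once the raw data, procedure and parameters are all preserved, a reproducibility check amounts to re-running the procedure on the same raw data and comparing the outcome with what was originally recorded. The sketch below shows that comparison under simple assumptions: `procedure`, `is_reproduced` and the sample values are hypothetical stand-ins, not part of LABDRIVE, and the procedure is assumed to be deterministic.

```python
import hashlib
import json


def digest(obj) -> str:
    """SHA-256 of a canonical JSON encoding, so equal results give equal digests."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def procedure(raw_values, scale):
    """Stand-in for a preserved procedure: a deterministic transformation."""
    return [round(v * scale, 6) for v in raw_values]


def is_reproduced(raw_values, parameters, recorded_digest) -> bool:
    """Re-run the procedure and compare its result with the recorded digest."""
    result = procedure(raw_values, **parameters)
    return digest(result) == recorded_digest


raw = [1.0, 1.5, 2.25]            # preserved raw data (invented values)
params = {"scale": 2.0}           # preserved parameters
recorded = digest(procedure(raw, **params))  # digest stored at the original run

print(is_reproduced(raw, params, recorded))          # same data and parameters
print(is_reproduced(raw, {"scale": 2.1}, recorded))  # a changed parameter
```

The first check succeeds and the second fails, which shows why the parameters need to be preserved alongside the data. Real numerical procedures may need a tolerance-based comparison instead of an exact digest match.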

Reproducibility of computer based research using encapsulated complex software is discussed in terms of the usability of such software in:

Of course, being able to preserve the procedures and software also allows LABDRIVE users to carry out replicability studies by collecting fresh raw data.

These complex software set-ups may be useful in Exploiting preserved information.
