
DSRH

Data

(What it is, why it is important and what needs to be done).

Data often outlives the project that created it. Increasingly, funders request or require that funding recipients create and follow plans for managing their data, storing or preserving it for the future, and making some or all of it available on open access. It is accepted that not all data can be shared, for commercial or ethical reasons, so there may well be an opt-out clause in the funder’s policy.

Data Nightmare

 

What it is

Research data may be created by an individual researcher or a group, or contributed to a research group by a third party. Some examples are:

Observational: data captured in real time that is usually unique, e.g. sensing data, survey data, field recordings, sample data.

Experimental: data captured from lab equipment that is often reproducible, e.g. gene sequences, magnetic field data.

Models or simulations: data generated from test models, where the model and its metadata may be more important than the output data, e.g. climate models, economic models.

Derived or compiled: data resulting from processing or combining “raw” data, often reproducible.

Reference or canonical: a static or organic conglomeration or collection of datasets, probably published and curated, e.g. gene sequence databanks, collections of letters, historical images.

Examples of outputs to be considered

Documents, spreadsheets

Scanned laboratory notebooks, field notebooks, diaries

Online questionnaires, transcripts, surveys or codebooks

Digital audiotapes, videotapes or other digital media

Scanned photographs or films

Transcribed test responses

Database contents (video, audio, text, images)

Digital models, algorithms, scripts

Contents of an application (input, output, log files for analysis software, simulation software, schemas)

Documented methodologies and workflows

Records of standard operating procedures and protocols 

Why manage data?

Data is a digital asset and needs to be managed:

  1. To ensure research integrity and validation of results.
  2. To increase research efficiency.
  3. To facilitate data security and minimise the risk of loss.
  4. To ensure wider dissemination, increased impact and online sharing.
  5. To enable research continuity through secondary data use.
  6. To comply with funders’ requirements. 

Having a good plan means that you can find and understand your data when you need to, that there is continuity when people leave or join the project, that you avoid unnecessary duplication, and that your data and publications are linked together. Ultimately, your research becomes more visible and has greater impact (data attracts citations). In the Sciences, the expectation is that data can be used for factual purposes and to validate the research. In the Arts & Humanities, data is more about evidence for the production of new knowledge, which may well be more subjective, ephemeral and tacit. If data is not managed properly, it can quickly become lost or unusable, for example because of obsolete file formats or hardware.

REMEMBER: DO NOTHING + DATA = NO DATA