Research data is any information that has been collected, observed, generated or created to validate original research findings.
Data is fundamental to all research, and all research has data in some form or other. Data does not just mean huge datasets; it can also be Excel files, diaries, lab notes, photographs, interviews, and a myriad of other outputs of the research process.
Research data may be created by an individual researcher or a group, or contributed to a research group by a third party. Some examples are:
Observational: data captured in real time that is usually unique, e.g. sensing data, survey data, field recordings, sample data.
Experimental: data captured from lab equipment that is often reproducible. Examples are gene sequences, magnetic field data.
Models or simulations: data generated from test models where the model and metadata may be more important than the output data from the model. Examples are climate models, economic models.
Derived or compiled: data resulting from processing or combining “raw” data; often reproducible.
Reference or canonical: a static or organic conglomeration or collection of datasets, usually published and curated. Examples are gene sequence databanks, collections of letters, historical images.
Examples of outputs to be considered: documents, spreadsheets; scanned laboratory notebooks, field notebooks, diaries; online questionnaires, transcripts, surveys or codebooks; digital audiotapes, videotapes or other digital media; scanned photographs or films; transcribed test responses; database contents (video, audio, text, images); digital models, algorithms, scripts; contents of an application (input, output, log files for analysis software, simulation software, schemas); documented methodologies and workflows; records of standard operating procedures and protocols.
Increasingly, evidence of good data management has become a basic requirement of research funding. That aside, when embarking on a research project, it is clearly in a researcher’s interest to have a solid data management plan in place from the start, if only to avoid future heartache and wasted time through data loss, overcrowded file folders, and unanticipated legal or ethical roadblocks.
Beyond the time, money and effort that go into creating data, it is worth considering the data’s long-term value, both for the researcher(s) and for the community at large. UCD Library has provided an overview of the benefits of smart data management here.
If we value transparency, equality, and innovation in research, it is essential for data to be as open as possible:
“Open data is data that anyone can access, use and share.
“Open data becomes usable when made available in a common, machine-readable format.
“Open data must be licensed. Its licence must permit people to use the data in any way they want, including transforming, combining and sharing it with others, even commercially.” [European Commission]
Tim Berners-Lee devised a five-star scheme for open data, in which each star marks one further step towards openness, demonstrating how easy it can be to make data open.
(Source: Wikipedia)
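As the five-star scheme is commonly summarised, the stars are cumulative: (1) the data is online under an open licence, (2) it is structured and machine-readable, (3) it is in a non-proprietary format such as CSV, (4) it uses URIs to identify things, and (5) it links to other data. The short sketch below illustrates that cumulative logic only; the `Dataset` class, its field names, and the `star_rating` function are hypothetical and invented for this illustration, not part of any official tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a dataset, one flag per star."""
    open_licence: bool      # 1 star: online under an open licence
    structured: bool        # 2 stars: structured, machine-readable data
    non_proprietary: bool   # 3 stars: open format such as CSV
    uses_uris: bool         # 4 stars: URIs identify the things described
    linked: bool            # 5 stars: links out to other data


def star_rating(ds: Dataset) -> int:
    """Count stars in order, stopping at the first unmet criterion."""
    criteria = [
        ds.open_licence,
        ds.structured,
        ds.non_proprietary,
        ds.uses_uris,
        ds.linked,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break  # later stars do not count without the earlier ones
        stars += 1
    return stars


# An openly licensed Excel file: structured, but in a proprietary format.
excel_file = Dataset(True, True, False, False, False)
# The same table exported to CSV.
csv_file = Dataset(True, True, True, False, False)

print(star_rating(excel_file))
print(star_rating(csv_file))
```

Under these assumptions, simply exporting a spreadsheet to CSV and publishing it under an open licence is enough to move from two stars to three.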
This work is licensed under CC BY-NC-SA 4.0