What Not to Keep: Not All Data Has Future Research Value

Janice Yu Chen Kung, Sandy Campbell


The growing involvement of academic libraries in research data management has presented them with numerous challenges. While libraries and archives have always had collection development policies that define what they will or will not collect, policies for selecting research data for preservation are in their infancy. This study surveyed and interviewed academic researchers. From this research, an initial list of eight types of research data was identified that should not be preserved and made public by academic libraries and archives: data that are sensitive or confidential; proprietary data; easily replicable data; data that lack good metadata; test, pilot, or intermediate data; bad or junk data; data that cannot be used by others for a variety of reasons; and older data that are not used and have no obvious cultural or historical value. Conclusions drawn from the study will help librarians and archivists make informed decisions about which types of research data are worth keeping.


health data; research data management; data curation; data preservation





DOI: http://dx.doi.org/10.5596/c16-013

