
Adding Value through Data Curation

Libbie Stephenson, UCLA, and Jared Lyle, ICPSR / Umich, shared information about requirements for access to data generated from federally funded projects.

Jean Sack shared her notes from the session:

2003: data sharing for grants over $200,000. During 2011-13, at least 20 US government agencies (over $100M) were asked to respond on increasing access to the results of federally funded scientific research and studies.

Example website for data storage: Posting data, saving plans

Data access brings benefits: saved funds and openness for scientists. In a 2011 poll, 85% of UK researchers thought their data would be of interest, but fewer than half had made data available. In the USA, 33-43% had never shared data.

Max access = accessible, complete and self-explanatory, usable

Protect confidentiality = if human subjects, must mask identities; ICPSR secures downloads, virtual data enclave (lockdown browser), physical enclave; 6,400 restricted-use datasets with 2,000 agreements.

Appropriate attribution = need citation references for data used: ICPSR’s data citations page, IASSIST’s Quick Guide to Data Citation, DataCite (persistent identifier DOI)
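As a rough illustration of what a data citation with a persistent identifier can look like, the sketch below assembles one in the general creator, year, title, publisher, identifier pattern that DataCite recommends. The function name, the field names, and the sample study and DOI are hypothetical placeholders, not part of any ICPSR or DataCite tool.

```python
def format_data_citation(creators, year, title, publisher, doi, version=None):
    """Build a citation: Creator (Year). Title. [Version.] Publisher. DOI link."""
    names = "; ".join(creators)
    parts = [f"{names} ({year}).", f"{title}."]
    if version:
        parts.append(f"Version {version}.")
    parts.append(f"{publisher}.")
    # Writing the DOI as a resolvable link lets it double as a hyperlink to the data.
    parts.append(f"https://doi.org/{doi}")
    return " ".join(parts)


# All values below are made up for illustration only.
citation = format_data_citation(
    creators=["Doe, Jane", "Roe, Richard"],
    year=2014,
    title="Example Household Survey",
    publisher="Example Data Archive",
    doi="10.1234/example-doi",
    version="1",
)
print(citation)
```

The version is optional here because not every repository versions its datasets; where it does, citing the version makes clear which copy of the data was used.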

Long-term preservation = floppies? Flash drives used currently. Formats important

Data management planning = start when applying for funding for preservation, access, digital formats

(Some federal agencies will withhold 10% of a grant if data are not properly saved for access.) Private foundations like Ford and Hewlett are now requiring data management plans (Laurie Calhoun).

ICPSR has sample elements of data management plans on its website. The ICPSR Collection Development Policy shows the scope of the repository (letters of support can be supplied to show donors that data is secure and meets the requirements of ICPSR). Once, images and websites were proposed but were steered to a different repository. The DOI (needed for all datasets) could be attached to an abstract as a hyperlink to the data. Agencies may issue requirements to publish data sooner once results are published in journals (embargoes will be limited), e.g. Open Access supplementary materials.

NIH bioCADDIE / Data Discovery Index; USAID has an open data policy now (Chris) to give access to data immediately.

Data Documentation Initiative (DDI) formats show which metadata are needed. Intellectual property rights must be decided: Creative Commons CC0 removes all rights, versus limited access. Formats are important, such as SPSS, ASCII, media files. Where will you deposit? Multiple copies? Is there the ability to migrate from one format to another? Also to cover: storage and backup plans, links to similar data, quality assurance procedures to clean the data, security and permissions, and the names of those responsible for curation, cleaning, and archiving. What is the budget for preservation? Mandated data access does not bring extra money, so it MUST BE BUDGETED in the grant (data preparation!! And management!! Pay for archiving in a repository). How long should data be held?
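The questions in these notes can be treated as a checklist when reviewing a draft plan. Here is a minimal sketch, assuming the plan is kept as a simple dictionary; the element names are paraphrased from the notes and are not ICPSR's official list of plan elements.

```python
# Checklist items paraphrased from the session notes (hypothetical names,
# not an official ICPSR or funder list).
DMP_ELEMENTS = [
    "metadata_standard",         # e.g. DDI
    "intellectual_property",     # e.g. CC0 vs. limited access
    "file_formats",              # e.g. SPSS, ASCII, media files
    "deposit_location",
    "storage_and_backup",
    "quality_assurance",
    "security_and_permissions",
    "responsible_people",
    "preservation_budget",
    "retention_period",
]


def missing_elements(plan):
    """Return the checklist items that are absent or left blank in the plan."""
    missing = []
    for name in DMP_ELEMENTS:
        value = plan.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# A made-up draft plan with several gaps.
draft_plan = {
    "metadata_standard": "DDI",
    "file_formats": "SPSS, ASCII",
    "deposit_location": "ICPSR",
    "preservation_budget": "",  # still to be written into the grant budget
}

for name in missing_elements(draft_plan):
    print("Still to decide:", name)
```

A blank entry counts as missing on purpose, so that a budget line left empty is flagged before the grant application goes in.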

Print copies of two documents circulated:

  1. Guide to Social Science Data Preparation and Archiving, 5th Edition, 2012
  2. ICPSR Guide to Archiving Social Science Data for Institutional Repositories, 1st Edition

Q How long does ICPSR take to give access to data?

A There is a queue now, usually month(s). Sometimes delays are because the PI never contacted ICPSR (looks very bad!)

A ICPSR is a consortium of 700 organizations. NICHD has a topic index. Open ICPSR gives open access to anyone (from members) and can be searched through Google or Bing. lists 1000 repositories around the world,

Q Is there any problem with dual archiving in an institution and also in ICPSR?

A Does it really matter if you can put in a link? BUT if you change or add to data, both versions need to be edited. Which is up to date? Most data libraries only commit to store…for 10 years. At ICPSR the data is kept in usable formats and migrated. In journal supplements the zipped data files may become dated in the future (will Excel be used 10 years from now?).

Libbie Stephenson – resources available to us

Data Curation Profiles Toolkit in–

Helps in meeting with researchers about their project data: how long they want to keep it; intermediary files tend not to be shared (only the final is accessed); what resources are offered? Sometimes code is more important than the data if running simulations.

DMP Template Tool from the University of California (Discover UC3)

Managing and Sharing Data: Best Practices for Researchers, an excellent narrative resource from the UK

Libbie says that frequent checks with PIs about their data management plans are the best idea, as federal grant data management requirements are changing. Definitely need copies of questionnaires and codebooks! Librarians should be trained to do curation with appraisal tools! She is training interns from Information Studies schools. Colectica has tools to discover and evaluate data use for other projects…

Peer, Green and Stephenson (IDCC, February 2014) have an article on data quality and processes

Who are stakeholders & their roles, policies, usability over long term, staff competencies, finances

Standards and certification may be needed (System_architecture_for_Digital_Preservation_Neil_Jefferies)

Q Is training in survey research and data collection/preservation taught in graduate programs?

A Doctoral seminars are given by Libbie at UCLA. Oregon State has courses with syllabi online; Minnesota has courses. A summer 5-day course at ICPSR is for researchers and archivists to curate and manage data. Green, MacDonald & Rice: Policy making for Research Data in Repositories: A Guide



