A Data Management and Sharing (DMS) plan is a formal document that describes the scientific data to be generated or used in research projects and outlines specific provisions for ensuring that the data are FAIR: findable, accessible, interoperable, and reusable. The DMS plan should also take into account any legal, ethical, or technical issues that may limit data sharing.

Getting Started

Before you begin writing your data management and sharing plan, we recommend reviewing your funding agency’s data management and sharing policy. The policy will fully describe the funding agency’s expectations for managing, preserving, and sharing research data.

Most research at UNC goes through one of these five funding agencies:

  • National Institutes of Health: Scientific Data Sharing Policy
  • National Science Foundation: Data Management Plan Requirements; NSF Public Access Plan 2.0
  • Department of Defense: Plan to Establish Public Access to the Results of Federally Funded Research
  • National Institute of Environmental Health Sciences: Data Management and Sharing Policies
  • National Cancer Institute: Data Sharing and Public Access Policies

The current status of all federal funding agency public access plans and guidance can be found at Science.gov.

Once you’ve reviewed the funding agency policy, we recommend gathering all materials from your current grant proposal, even if they are only drafts. These documents will likely contain information relevant to your data management and sharing plan.


Writing the Plan

A data management and sharing plan should be about two pages. Typically, a plan requires the following information:

  • Data types, formats, and estimated size
  • Documentation and metadata standards
  • Roles and responsibilities
  • Security and storage
  • Sharing and preservation
  • Access restrictions, limitations, licensing

This is not an exhaustive list. Some policies may require other information, so be certain to note those requirements.

A useful tool for writing your plan is the DMPTool. It provides templates that include all required components of a funding agency’s data management and sharing policy. Note: UNC researchers must use the DMPTool to draft their data management and sharing plan (DMSP) in order to submit it for a DMSP review.

Data Types, Formats, and Estimated Size

Describe the types of data you expect to generate during your research project, and how those data will be generated or collected. For instance, will your research produce sequencing, imaging, or experimental data? How will those data be generated?

Provide an estimate of how much data you anticipate collecting. How many participants will be enrolled, or how many experiments will be conducted? How much data do you expect to collect (provide estimated file sizes)?
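The arithmetic behind a size estimate can be sketched in a few lines of Python. This is a minimal illustration, not a required method: the function names, the 20% overhead for derived files and documentation, and the example numbers are all assumptions to replace with figures from your own project.

```python
def estimate_total_bytes(n_units, files_per_unit, avg_file_bytes, overhead=0.2):
    """Estimate total storage in bytes.

    n_units: participants, samples, or experiments (hypothetical parameter)
    overhead: assumed fraction added for derived files and documentation
    """
    if n_units < 0 or files_per_unit < 0 or avg_file_bytes < 0 or overhead < 0:
        raise ValueError("all inputs must be non-negative")
    return n_units * files_per_unit * avg_file_bytes * (1 + overhead)


def human_readable(n_bytes):
    """Format a byte count using decimal units (1 GB = 10**9 bytes)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(n_bytes)
    for unit in units:
        if size < 1000 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000


# Example: 200 participants x 3 image files each x 500 MB per file,
# padded by the assumed 20% overhead.
total = estimate_total_bytes(200, 3, 500_000_000)
print(human_readable(total))  # 360.0 GB
```

Stating the inputs (units, files per unit, average file size) alongside the total lets reviewers see how the estimate was reached.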

Along with the types and amount of data generated during your project, it is important to describe the file formats in which your data will be stored and shared. Are you using a community standard file format? Are the file formats stable enough for long-term preservation?

Please see our RDM Guidance for recommended file formats for a variety of data types. The Data Curation Network has data primers available with file formats that lend themselves to preservation and sharing.

When describing your data formats, we also recommend listing any software required to access those data. Consider whether your data can be read by open-source software that is readily available and used by your research community. If not, provide information on the specific proprietary software being used in your research.
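One way to take stock of formats before writing this section is to count the file extensions in an existing project and flag any that may need a note about required software. The sketch below is only illustrative: `PREFERRED_EXTENSIONS` is an assumed, hypothetical list, not an official recommendation, and should be replaced with the formats your repository or research community recommends.

```python
from collections import Counter
from pathlib import PurePath

# Assumption: an illustrative set of open formats, not an official list.
PREFERRED_EXTENSIONS = {".csv", ".tsv", ".txt", ".json", ".xml", ".tif", ".tiff"}


def count_extensions(filenames):
    """Count files per lowercase extension ('' for files with none)."""
    return Counter(PurePath(name).suffix.lower() for name in filenames)


def needs_review(filenames, preferred=PREFERRED_EXTENSIONS):
    """Return the sorted extensions that are not in the preferred set."""
    return sorted(ext for ext in count_extensions(filenames) if ext not in preferred)


# Hypothetical file names from a project folder.
files = ["survey.csv", "survey_wave2.CSV", "scan01.czi", "notes.txt", "model.sav"]
print(count_extensions(files))
print(needs_review(files))  # ['.czi', '.sav']
```

Extensions returned by `needs_review` are candidates for either conversion to an open format or a sentence in the plan naming the software needed to open them.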

Metadata Standards and Documentation

A metadata standard is adopted by a research community as a means of describing research data to facilitate discovery, re-use, and understanding. Users should be able to understand what they can and cannot do with your data, how the data were collected, who collected the data, and the purpose of the study. Please note that some communities have metadata standards for describing data while others may not have an adopted standard.

The Research Data Alliance has put together a comprehensive metadata standard catalog which can be browsed by scheme and subject or searched across various fields like funder and data type.

Additionally, when identifying a data repository for sharing your data, you should look at which metadata standards it supports and whether they are appropriate for your data needs.

Documentation further describing your research data, methodological approach, compute environment and statistical software, and any manipulation of your data should also be addressed within your data management plan. These documents should be provided to help users further understand the context and contents of your research outputs. Documentation can include, but is not limited to, codebooks, data dictionaries, READMEs, study protocols, survey instruments, and methodology reports. If your funding agency asks for information regarding documentation, describe the types of documentation you will be sharing along with your research data.
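As an illustration of what a data dictionary can contain, the following Python sketch drafts one entry per column from a small CSV sample. It is a starting point under stated assumptions, not a metadata standard: the entry fields (`variable`, `example`, `n_missing`, `description`, `units`) and the sample data are hypothetical, and the descriptions and units still have to be written by the research team.

```python
import csv
import io


def draft_data_dictionary(csv_text):
    """Build one draft data-dictionary entry per column of a CSV sample."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        # Treat empty strings and absent trailing fields as missing values.
        values = [row[name] for row in rows if row[name] not in ("", None)]
        entries.append({
            "variable": name,
            "example": values[0] if values else "",
            "n_missing": len(rows) - len(values),
            "description": "",  # to be filled in by the research team
            "units": "",        # to be filled in by the research team
        })
    return entries


# Hypothetical sample data.
sample = "participant_id,age,score\nP01,34,0.8\nP02,,0.6\n"
for entry in draft_data_dictionary(sample):
    print(entry)
```

The resulting entries can be exported to a spreadsheet or README and completed by hand before deposit.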

Roles and Responsibilities

A key component of a data management plan is identifying the person(s) responsible for managing and sharing your data. These roles can belong to one person or be distributed across team members. In some cases, your research project may warrant hiring a data archivist or data steward to complete the curation and data repository deposits at the end of your project.

Make sure to clearly describe who will perform which actions on the expected research data from your project. If you need to hire staff to handle these tasks, be sure to include that within your budget. Note: many funding agencies are aware of the costs necessary to ensure data management activities are completed; therefore, it is expected that these costs be included in the proposed budget.

You can learn more about identifying roles and responsibilities in the RDM Guidance: Plan – Identify Roles.

Data Security and Storage

Information on where your data will be stored, who will have access, and how potentially identifying data will be kept secure should be included throughout your data management and sharing plan. This information will tie into access restrictions and limitations to sharing.

A few things to consider when drafting this information:

  1. Will the data generated during this project include personally identifiable information (PII) or protected health information (PHI)? If so, how will the data be protected during and after collection and analysis?
  2. Should you consult with your department IT or ITS Research Computing for a secure storage solution? How long will you need to keep these data secure? Who will have access to these data? Can they be requested with a data use agreement and IRB approval?

We recommend consulting with external collaborators, experts, and/or university services during the planning phase to ensure you account for costs and time within your proposal and DMSP.

Data Sharing and Preservation

Your data management plan should include information on how and where you will be sharing and preserving the generated research data from your project. What repository(ies) will you be using? Will the data be publicly available for download, or will users need to request access? How long will your data be made available within the data repository?

The first step to answering these questions is to identify an appropriate, trustworthy data repository. Review your funding agency’s data sharing guidance to see if it requires data to be shared in a designated repository. If not, you will need to locate the most appropriate repository for your data.

Domain-specific data repositories
Domain-specific data repositories are built to support and preserve specific data formats from their designated research community. These repositories usually offer features and metadata standards relevant to their domain and have staff and expertise available to answer questions related to preserving commonly used data types. There are many domain-specific data repositories available; however, they do not exist for all data types and/or disciplines.

NIH maintains a list of Repositories for Sharing Scientific Data, which researchers can search by keyword or by institute/center. NIH also provides guidance on Selecting a Data Repository, which describes the desirable characteristics a data repository should have to meet the requirements of the NIH data management and sharing policy. The Registry of Research Data Repositories (re3data) is another useful database for finding domain-specific data repositories. It allows users to search by keyword or browse by content type, subject, or country. Results show whether a repository offers licensing and open access to data and whether it uses persistent identifiers.

Generalist data repositories
If a domain-specific data repository doesn’t exist in your field, then a generalist data repository is most likely an appropriate fit for sharing and preserving your research data. A generalist data repository will share and preserve any data regardless of file format, type, or discipline. There are a handful of popular generalist data repositories available to researchers:

  • UNC Dataverse (for UNC researchers)
  • Dryad
  • Figshare
  • Mendeley Data
  • Zenodo

Once you have identified an appropriate data repository, include information about when and how the data will be shared within that platform. You will also want to describe how the data will be made findable within the repository and for how long. Does it use persistent identifiers? What is the repository’s commitment to maintaining access to its holdings?

Access Restrictions, Limitations, and Licensing

Funding agencies are aware that not all data generated and used to report the results of a funded project can be fully shared due to ethical, technical, or legal limitations. If you will be collecting data that may be too sensitive, may have legal consequences, or may be too large to share entirely, describe those factors within the DMSP and provide information on how you will make as much data available as possible and under what conditions.

A few questions to ask yourself:

  1. Can I de-identify and clean the data sufficiently to ensure participant privacy is not compromised while also maintaining the utility of these data? If not, can the data be stored securely and requested through a secure transfer protocol and data use agreement?
  2. For big data, are there subsets of the data that I can share that will provide users with enough information for re-use? If not, can a data access process be created that permits users to analyze the data via the institution’s equipment or is there a way to transfer data through a secure protocol?
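Part of the first question above can be illustrated with a short Python sketch that drops direct identifiers from a record and replaces the participant ID with a keyed pseudonym. The column names in `DIRECT_IDENTIFIERS`, the `participant_id` field, and the 16-character pseudonym length are hypothetical choices made for illustration. Removing direct identifiers alone does not guarantee that data are de-identified, since combinations of other variables can still identify participants, so treat this as one step to discuss with your IRB rather than a complete procedure.

```python
import hashlib
import hmac

# Assumption: hypothetical column names for direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymize(value, secret_key):
    """Return a stable pseudonym for value using HMAC-SHA256.

    The same value and key always give the same pseudonym, so records
    can still be linked without exposing the original ID.
    """
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record, secret_key, id_field="participant_id"):
    """Return a copy of record without direct identifiers."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if id_field in cleaned:
        cleaned[id_field] = pseudonymize(str(cleaned[id_field]), secret_key)
    return cleaned


# Hypothetical record; in practice the key is stored securely and
# separately from the shared data.
record = {"participant_id": "P01", "name": "Jane Doe", "email": "jd@example.org", "age": 34}
print(deidentify(record, secret_key=b"replace-with-a-secret-key"))
```

A keyed hash is used here instead of a plain hash so that someone without the key cannot recompute pseudonyms from a list of known IDs.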

Funders ask that researchers make a best effort to share as much data as is feasible. If funders have questions about your expected limitations for sharing, they will ask for clarification.


Guidance for Funding Agency Policies

Each funding agency has its own data sharing requirements and, therefore, its own guidance. Please thoroughly review the guidance below and visit the DMPTool templates to get a better sense of what will be required in your data management and sharing plan.

Resources by funding agency (funder guidance, RDMC guidance, and templates, where available):

  • National Institutes of Health: NIH Guidance (funder); NIH Guidance (RDMC); DMPTool template
  • National Science Foundation: NSF Guidance; DMPTool template
  • Department of Defense: DMPTool template
  • National Institute of Environmental Health Sciences: NIH Guidance; DMPTool template
  • National Cancer Institute: NIH Guidance; DMPTool template

Example Data Management Plans

There are many examples of data management plans available for review. We recommend reading through a few DMSPs to see what content is included and how information is described.

  • Example DMS Plans Directory
  • DMPTool Public Plans
  • NIH Sample Plans