NIH Data Sharing Policy: Requirements and Compliance
The NIH Data Sharing Policy governs how researchers who receive federal funding from the National Institutes of Health must manage, store, and make accessible the scientific data generated through that funding. The policy applies broadly across grant mechanisms, clinical trials, and intramural research, with requirements that vary by data type and, for awards made under the earlier policy, by funding level. Understanding these requirements is essential for investigators, institutions, and sponsored programs offices seeking to maintain compliance and avoid jeopardizing future funding.
Definition and scope
The NIH Policy for Data Management and Sharing (referred to here as the DMDS Policy), which took effect on January 25, 2023 (NIH NOT-OD-21-013), establishes the expectation that scientific data generated with NIH funding will be made available to other researchers in a timely manner, to the maximum extent possible. The policy applies to all NIH-funded research that generates scientific data, regardless of funding level. This is a significant expansion from the prior threshold, which required data sharing plans only for applications requesting $500,000 or more in direct costs in any single year.
"Scientific data" under this policy refers to recorded factual material commonly accepted in the scientific community as necessary to document, validate, and replicate research findings (NIH Scientific Data Definition, NOT-OD-21-013). This definition explicitly excludes laboratory notebooks, preliminary analyses, drafts of publications, peer reviews, and communications related to ongoing research.
The policy also distinguishes between data management and data sharing:
- Data management refers to the plans and actions taken during the course of research to protect data integrity, organize files, and document methodology.
- Data sharing refers to the planned release of datasets to third parties, typically via a designated repository, after research milestones or publication.
How it works
Compliance with the DMDS Policy is operationalized through the Data Management and Sharing Plan (DMSP), a document submitted as part of the grant application or prior to award, depending on the activity code. The DMSP must address six core elements as defined by NIH:
- Data type — description of the scientific data to be generated, including estimated volume and file formats.
- Related tools, software, and code — identification of any specialized tools necessary to access or interpret the data.
- Standards — metadata standards and common data elements that will be applied.
- Data preservation, access, and timelines — the repository where data will be deposited, the timeline for deposition, and the duration of availability.
- Access, distribution, or reuse considerations — whether any factors limit sharing (e.g., participant privacy, proprietary constraints, national security) and how those will be addressed.
- Oversight — how the institution and investigator will ensure the DMSP is followed throughout the project period.
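As a rough illustration of the six elements above, the following Python sketch checks a draft plan for empty sections before submission. The `DMSPDraft` class, its field names, and the sample text are hypothetical and are not an NIH schema or form; the check only detects blank fields and says nothing about whether the content would satisfy a reviewer.

```python
from dataclasses import dataclass, fields


@dataclass
class DMSPDraft:
    """Hypothetical container for the six DMSP elements (not an NIH format)."""
    data_type: str = ""
    tools_software_code: str = ""
    standards: str = ""
    preservation_access_timelines: str = ""
    access_distribution_reuse: str = ""
    oversight: str = ""


def missing_elements(draft: DMSPDraft) -> list[str]:
    """Return the names of elements that are empty or whitespace-only."""
    return [f.name for f in fields(draft) if not getattr(draft, f.name).strip()]


# Illustrative draft with three of the six elements filled in (made-up text).
draft = DMSPDraft(
    data_type="Proteomic mass spectrometry data in an open file format",
    standards="Metadata standard required by the chosen repository",
    oversight="PI reviews adherence to the plan once per budget period",
)
print(missing_elements(draft))
```

A sponsored programs office could run a check like this as a pre-submission reminder, but the substantive review of each element still rests with the investigator and NIH program staff.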
NIH program officers review DMSPs for compliance. Approved DMSPs become terms of the award; failure to follow through with the stated plan can constitute a material violation of award conditions. Reasonable data management and sharing costs may be included in the budget request and justified as allowable costs, but they must not duplicate infrastructure costs already recovered through indirect (F&A) rates or be charged as both direct and indirect costs (NIH Allowable Costs for DMS, NOT-OD-21-015).
For genomic data specifically, separate requirements under the NIH Genomic Data Sharing (GDS) Policy apply to studies generating large-scale human or non-human genomic data. That policy mandates submission to NIH-designated repositories such as the database of Genotypes and Phenotypes (dbGaP) for human genomic data.
Further details on how the DMDS Policy fits within NIH's broader regulatory framework are available through the NIH Policies and Regulations reference page on this site, which provides context across compliance domains.
Common scenarios
Scenario 1: Clinical trial with patient-level data
A funded clinical trial generating individual-level health records must submit a DMSP addressing de-identification or controlled access. Data from NIH-funded clinical trials is typically deposited in a repository that supports controlled access — such as dbGaP — where secondary researchers must apply for access, thereby protecting participant privacy under the Common Rule (45 CFR 46).
Scenario 2: Basic science lab with no human subjects
A laboratory generating proteomic or imaging data without human subjects involvement faces fewer access restrictions. The DMSP would name an appropriate domain repository (e.g., ProteomeXchange for proteomics) or, where no domain repository fits, a generalist repository such as Figshare, and specify a deposition timeline, typically no later than the time of publication of findings.
Scenario 3: Small R21 exploratory grant
Even grants below the old $500,000 threshold now require a DMSP under the 2023 policy. An R21 investigator generating a limited dataset must still document data type, format, and intended repository — even if the conclusion is that the data volume or nature makes open sharing impractical, in which case a justification must be provided.
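The difference between the prior threshold and the 2023 rule can be sketched as a small Python function. The function name, parameters, and budget figures are hypothetical, and the prior threshold is treated here as "$500,000 or more in direct costs in any single year", which is an assumption about how the boundary case is handled.

```python
PRIOR_THRESHOLD_USD = 500_000  # direct costs in any single year (prior policy)


def plan_required(direct_costs_by_year: list[int],
                  generates_scientific_data: bool,
                  under_2023_policy: bool) -> bool:
    """Sketch of whether a data sharing plan (or DMSP) is expected."""
    if under_2023_policy:
        # 2023 policy: funding level is irrelevant; what matters is
        # whether the research generates scientific data.
        return generates_scientific_data
    # Prior policy (assumed boundary): a plan is expected only if some
    # single year reaches the direct-cost threshold.
    return any(cost >= PRIOR_THRESHOLD_USD for cost in direct_costs_by_year)


# Hypothetical two-year exploratory budget, well under the old threshold.
small_grant = [150_000, 125_000]
print(plan_required(small_grant, True, under_2023_policy=True))
print(plan_required(small_grant, True, under_2023_policy=False))
```

With these made-up figures, the same project needs a plan under the 2023 policy but would not have needed one under the prior threshold.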
Decision boundaries
The table below outlines key distinctions that affect how the policy applies:
| Factor | If present (or under the 2023 policy) | If absent (or under the prior policy) |
|---|---|---|
| Human subjects data | De-identification or controlled access expected; IRB coordination necessary | Not applicable; open repository generally permitted |
| Genomic data (large-scale) | GDS Policy applies; deposition in an NIH-designated repository (dbGaP for human genomic data) | Standard DMDS Policy applies |
| Funding level | 2023 policy: DMSP required for all NIH-funded research that generates scientific data | Prior policy: plan required only at $500,000 or more in direct costs in a single year |
| Data sensitivity | Justification for limiting sharing must be documented in DMSP | No restriction; open deposition expected |
| Proprietary concerns | Licensing terms must be addressed; sharing may be delayed but not indefinitely waived | N/A |
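The data-related rows of the table can also be read as a simple set of rules. The Python sketch below encodes them under that reading; the `Project` class and its flags are hypothetical, the note strings paraphrase the table, and real determinations depend on the award terms and institute guidance, not on a checklist like this.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """Hypothetical project flags mirroring the factors in the table."""
    human_subjects_data: bool = False
    large_scale_genomic: bool = False
    sensitive_data: bool = False
    proprietary_concerns: bool = False


def sharing_considerations(p: Project) -> list[str]:
    """Collect the table's notes for every factor that applies."""
    notes = []
    if p.human_subjects_data:
        notes.append("Plan for de-identification or controlled access; coordinate with the IRB")
    if p.large_scale_genomic:
        notes.append("GDS Policy applies; use an NIH-designated repository")
    if p.sensitive_data:
        notes.append("Document the justification for limiting sharing in the DMSP")
    if p.proprietary_concerns:
        notes.append("Address licensing terms; sharing may be delayed, not waived indefinitely")
    if not notes:
        # None of the limiting factors apply.
        notes.append("No restriction identified; open deposition expected")
    return notes


for note in sharing_considerations(Project(human_subjects_data=True, large_scale_genomic=True)):
    print(note)
```

Treating each factor independently keeps the sketch short; a project with several factors simply accumulates the corresponding notes.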
Investigators uncertain about repository selection can consult NIH's list of Generalist Repositories or domain-specific guidance from the relevant NIH institute. A list of all NIH institutes and their specific data guidance can be found at NIH Institutes and Centers.
The boundary between "scientific data" (subject to sharing) and "research records" (not subject to sharing) is a recurring compliance question. NIH's position is that the data necessary to replicate published findings falls within scope; raw instrument outputs that are processed into a final dataset may or may not be required depending on field conventions and the DMSP terms agreed at time of award.
The DMDS Policy intersects with, but is distinct from, the NIH Public Access Policy, which governs public access to peer-reviewed manuscripts resulting from NIH funding. Both policies share the goal of maximizing public return on federally funded research, but they operate through separate compliance mechanisms and timelines. A broader overview of NIH's research enterprise and related compliance areas is accessible through the NIH Authority site index.