NIH All of Us Research Program: Precision Medicine Initiative

The NIH All of Us Research Program is a landmark national cohort study designed to accelerate precision medicine by building one of the most diverse biomedical research databases in history. Authorized under the 21st Century Cures Act (Pub. L. 114-255, signed December 2016), the program aims to enroll at least 1 million participants across the United States. This page covers the program's definition and scope, its operational mechanisms, the scenarios in which it applies, and the boundaries that govern researcher and participant decisions.


Definition and scope

The All of Us Research Program is a long-term effort coordinated by the NIH to gather health data — including biological samples, electronic health records, surveys, and wearable device outputs — from a participant cohort explicitly recruited to reflect the demographic diversity of the US population. The program operates under the Office of the Director at NIH and is administered through the All of Us Research Program Office (allofus.nih.gov).

The scope extends beyond any single disease or condition. Rather than targeting a specific illness, the program generates a foundational resource for investigators studying the interplay between genetics, environment, lifestyle, and health outcomes. As documented by the NIH All of Us Research Program, the enrolled cohort is designed so that more than 50 percent of participants come from racial and ethnic groups historically underrepresented in biomedical research — a structural design choice intended to correct systematic gaps in existing genomic and clinical datasets.

The program grew out of the Precision Medicine Initiative, announced in 2015 under the Obama administration, which identified large-scale cohort building as a foundational federal strategy.


How it works

The program operates through a distributed network of Healthcare Provider Organizations (HPOs), community-based enrollment centers, and a direct-volunteer pathway. The operational sequence proceeds as follows:

  1. Enrollment and consent — Participants enroll online or at a physical enrollment site, provide electronic informed consent, and authorize the program to access their electronic health records.
  2. Biospecimen collection — Participants provide blood and urine samples at collection sites. Samples are processed and stored at the All of Us Biobank, operated by Mayo Clinic in Rochester, Minnesota, under contract with NIH.
  3. Survey completion — Participants complete standardized surveys covering lifestyle factors, family health history, social determinants of health, and physical measurements.
  4. Genomic analysis — Collected biospecimens undergo whole genome sequencing. As of the program's v7 data release, the Researcher Workbench contained whole genome sequence data for more than 245,000 participants (All of Us Research Hub, Data Release Notes).
  5. Data linkage — Survey responses, EHR data, and genomic data are integrated into a unified participant record within a controlled-access cloud environment.
  6. Researcher access — Qualified researchers register through the All of Us Researcher Workbench, complete required training, and access de-identified or limited datasets depending on their approved protocol tier.
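
The data linkage step above can be illustrated with a minimal sketch. It assumes each source (surveys, EHR, genomics) arrives as a list of rows keyed by a shared participant identifier. The field names (`participant_id`, `answers`, `code`) and the function `link_records` are hypothetical and do not reflect the program's actual schema or pipeline.

```python
from collections import defaultdict


def link_records(surveys, ehr_rows, genomic_rows):
    """Merge per-source rows into one record per participant ID.

    All field names here are illustrative assumptions, not the
    program's real data model.
    """
    # Each new participant starts with an empty, independent record.
    linked = defaultdict(lambda: {"survey": None, "ehr": [], "has_wgs": False})

    for row in surveys:
        linked[row["participant_id"]]["survey"] = row["answers"]
    for row in ehr_rows:
        # A participant may have many EHR entries, so collect them in a list.
        linked[row["participant_id"]]["ehr"].append(row["code"])
    for row in genomic_rows:
        linked[row["participant_id"]]["has_wgs"] = True

    return dict(linked)


if __name__ == "__main__":
    surveys = [{"participant_id": "P001", "answers": {"smoker": False}}]
    ehr_rows = [
        {"participant_id": "P001", "code": "E11.9"},
        {"participant_id": "P002", "code": "I10"},
    ]
    genomic_rows = [{"participant_id": "P001"}]
    for pid, record in sorted(link_records(surveys, ehr_rows, genomic_rows).items()):
        print(pid, record)
```

Keying every source on one identifier means a participant with partial data (for example, EHR entries but no survey) still gets a record, with the missing parts left empty rather than dropped.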

The Researcher Workbench is hosted on Google Cloud Platform under a FISMA-compliant security framework, and all data use is governed by the All of Us Data Use Policies, which prohibit re-identification attempts and require annual attestation of compliance.


Common scenarios

The All of Us dataset applies across a range of investigative contexts.


Decision boundaries

Not all research questions or investigators qualify for All of Us data access. The program distinguishes between two primary access tiers:

Registered Tier vs. Controlled Tier

Feature                 | Registered Tier             | Controlled Tier
Genomic data            | Aggregated only             | Individual-level whole genome sequences
EHR data                | De-identified, aggregate    | Longitudinal individual records
Training required       | Basic data use training     | Extended responsible conduct of research modules
IRB requirement         | Not required for aggregate  | Required for individual-level access
Re-identification risk  | Low                         | Managed through data use agreement

Research proposals involving controlled-tier data must comply with the NIH data sharing policy and are subject to review under the All of Us Data Access Review Committee. Proposals that involve linking All of Us data to external datasets require additional approval, because cross-dataset linkage increases re-identification risk.
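
The tier rules described in this section can be expressed as a small decision function. This is only a sketch of the logic as stated on this page. The class `AccessRequest`, the function `access_requirements`, and the wording of each step are hypothetical, and the real review process is carried out by people, not by code like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical summary of what a research proposal needs."""
    needs_individual_level_data: bool
    links_external_dataset: bool
    commercial_affiliation: bool


def access_requirements(req):
    """Return (tier, required steps) following the rules described above."""
    tier = "controlled" if req.needs_individual_level_data else "registered"
    steps = []

    if tier == "controlled":
        steps.append("extended responsible conduct of research training")
        steps.append("IRB review")
        steps.append("Data Access Review Committee review")
    else:
        steps.append("basic data use training")

    # External linkage raises re-identification risk, so it needs extra approval.
    if req.links_external_dataset:
        steps.append("additional approval for external dataset linkage")

    # For-profit investigators may register but must disclose commercial intent.
    if req.commercial_affiliation:
        steps.append("disclose commercial intent at registration")

    return tier, steps


if __name__ == "__main__":
    request = AccessRequest(
        needs_individual_level_data=True,
        links_external_dataset=True,
        commercial_affiliation=False,
    )
    tier, steps = access_requirements(request)
    print(tier)
    for step in steps:
        print(" -", step)
```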

The program does not permit commercial product development as a primary research aim. Investigators affiliated with for-profit entities may access the Workbench but must disclose commercial intent during registration. All publications arising from All of Us data must acknowledge the program per citation requirements published at allofus.nih.gov/publications.

Participants retain the right to withdraw from the program at any time; upon withdrawal, no new data is collected, though previously contributed data already incorporated into research datasets may be retained under standard scientific integrity principles.
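
The withdrawal rule can be sketched as a record that stops accepting new data once a participant withdraws while leaving earlier contributions in place. The `Participant` class and its methods are hypothetical. Keeping prior records is an assumption that mirrors the "may be retained" wording above, not a statement of how the program stores data.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical participant record used only for illustration."""
    participant_id: str
    withdrawn: bool = False
    records: list = field(default_factory=list)

    def add_record(self, record):
        """Store a new record unless the participant has withdrawn."""
        if self.withdrawn:
            return False  # no new data is collected after withdrawal
        self.records.append(record)
        return True

    def withdraw(self):
        """Mark the participant as withdrawn; earlier records are left as-is."""
        self.withdrawn = True


if __name__ == "__main__":
    p = Participant("P001")
    p.add_record("baseline survey")
    p.withdraw()
    accepted = p.add_record("follow-up survey")
    print(accepted, p.records)
```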


References