NIH Peer Review Process: How Applications Are Evaluated

The National Institutes of Health peer review system is the primary mechanism by which the federal government evaluates competing grant applications before committing public research funds. This page covers the full evaluation sequence — from initial assignment through funding recommendation — including the scoring methodology, the roles of standing study sections, common structural failure modes in applications, and the tensions inherent in peer-based scientific judgment. Understanding this system is essential for investigators, institutions, and policymakers engaged with NIH grant mechanisms and funding decisions.

Definition and scope
Core mechanics or structure
Causal relationships or drivers
Classification boundaries
Tradeoffs and tensions
Common misconceptions
Checklist or steps (non-advisory)
Reference table or matrix

Definition and scope

NIH peer review is a two-tiered federal evaluation process mandated under the Public Health Service Act (42 U.S.C. § 284) and implemented through policies administered by the Center for Scientific Review (CSR). Its statutory purpose is to ensure that federal biomedical research funds are awarded on the basis of scientific merit, as assessed by qualified experts independent of the funding agency.

The scope of peer review encompasses all research grant applications submitted to NIH, including R01, R21, R03, and P-series program project grants, as well as fellowship applications (F-series) and training grants (T-series). Contracts and some intramural activities operate under separate procurement and review mechanisms and fall outside the peer review framework described here. The NIH grant application process governs the submission stage that precedes peer review.

The Center for Scientific Review, established as NIH's central receipt and referral office, manages approximately 70,000 grant applications per year (NIH Center for Scientific Review) and routes each one to an appropriate Scientific Review Group (SRG), commonly called a study section.

Core mechanics or structure

Tier 1 — Scientific Review Group (SRG)

The first tier is conducted by a standing or special-emphasis study section composed of 20 to 30 extramural scientists appointed by CSR. Reviewers are selected based on expertise relevant to the application pool assigned to that panel. Each application receives three assigned reviewers: a primary, a secondary, and a tertiary (reader).

Before the panel meets, the three assigned reviewers independently rate each of the five scored review criteria on a 9-point scale, where 1 represents exceptional and 9 represents poor. The five criteria, as defined by NIH (NIH Peer Review Criteria), are:

Significance — whether the study addresses an important problem
Investigators — qualifications and experience of the research team
Innovation — novelty relative to existing knowledge and methods
Approach — rigor, feasibility, and methodology
Environment — adequacy of institutional and research setting

After individual scoring, reviewers submit written critiques. Applications falling in the bottom half of scientific merit (as determined by average preliminary scores) are "triaged" — not discussed at the panel meeting. Applicants receive written critiques but no impact score for triaged applications.
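
The triage step can be sketched as a ranking on mean preliminary scores. This is a minimal illustration, assuming each application carries three preliminary scores and that the cut falls exactly at the midpoint of the pool; the function name, the application IDs, and the tie handling are hypothetical and are not taken from NIH procedure.

```python
from statistics import mean

def split_for_discussion(prelim_scores):
    """Split applications into (discussed, triaged) by mean preliminary score.

    prelim_scores maps an application ID to the preliminary scores
    (1 = best, 9 = worst) from its assigned reviewers. Cutting at the
    midpoint of the pool is an illustrative assumption.
    """
    # Lower mean is better, so an ascending sort puts the strongest first.
    ranked = sorted(prelim_scores, key=lambda app: mean(prelim_scores[app]))
    cutoff = (len(ranked) + 1) // 2  # top half, rounded up, is discussed
    return ranked[:cutoff], ranked[cutoff:]

# Hypothetical application IDs and scores.
scores = {
    "APP-A": [2, 3, 2],
    "APP-B": [5, 6, 4],
    "APP-C": [1, 2, 2],
    "APP-D": [7, 6, 8],
}
discussed, triaged = split_for_discussion(scores)
print(discussed)  # ['APP-C', 'APP-A']
print(triaged)    # ['APP-B', 'APP-D']
```

In this sketch the triaged applications would still receive their written critiques; only the panel discussion and the impact score are skipped.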

Applications above the triage threshold are discussed at the full panel meeting. After discussion, all eligible reviewers — not just the three assigned — cast independent scores on the same 9-point scale. The arithmetic mean of all submitted scores is multiplied by 10 to produce the final impact score, ranging from 10 (best) to 90 (worst).
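
The arithmetic described above is small enough to show directly. The sketch below assumes the result is reported as a whole number with halves rounded up; that rounding rule and the function name are assumptions for illustration, since the text only specifies the mean multiplied by 10.

```python
def impact_score(panel_scores):
    """Mean of the panel's 1-9 scores, times 10, as a whole number.

    Integer arithmetic is used so that halves round up consistently
    (the rounding rule itself is an assumption of this sketch).
    """
    if not panel_scores:
        raise ValueError("no scores submitted")
    if any(not 1 <= s <= 9 for s in panel_scores):
        raise ValueError("each score must be an integer from 1 to 9")
    total, n = sum(panel_scores), len(panel_scores)
    # Equivalent to floor(10 * total / n + 0.5) without floating point.
    return (20 * total + n) // (2 * n)

# Hypothetical votes from eight eligible panel members.
votes = [2, 2, 3, 2, 1, 2, 3, 2]
print(impact_score(votes))    # 21
print(impact_score([1] * 5))  # 10
print(impact_score([9] * 5))  # 90
```

The two extreme cases confirm the 10 to 90 range stated above: a panel voting all 1s yields 10, and a panel voting all 9s yields 90.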

The percentile rank is then calculated by comparing an application's impact score against the distribution of impact scores for all scored applications reviewed by the same SRG in the current and two preceding review cycles. Percentile rank, not raw impact score, is the primary metric used by institute advisory councils and program staff in funding decisions.
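
One way to turn an impact score into a percentile is sketched below. The formula used here, 100 × (rank − 0.5) / N with tied scores sharing an average rank, is an assumption made for illustration; the text above only states that the score is compared against the study section's score distribution. The function name and the sample pool are hypothetical.

```python
def percentile_rank(score, pool):
    """Percentile of `score` within `pool` (lower is better).

    pool holds the impact scores of all scored applications from the
    same study section over the comparison window, including `score`.
    Uses 100 * (rank - 0.5) / N with ties sharing the average rank;
    this formula is an illustrative assumption.
    """
    if score not in pool:
        raise ValueError("score must be part of the pool")
    better = sum(1 for s in pool if s < score)
    tied = sum(1 for s in pool if s == score)
    rank = better + (tied + 1) / 2  # average rank among tied scores
    return 100 * (rank - 0.5) / len(pool)

# Hypothetical pool of ten scored applications.
pool = [10, 15, 20, 20, 25, 30, 35, 40, 50, 60]
print(percentile_rank(20, pool))  # 30.0
print(percentile_rank(10, pool))  # 5.0
```

Because the pool is specific to one study section, the same impact score can map to different percentiles on different panels.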

Tier 2 — National Advisory Council

The second review tier is conducted by the National Advisory Council or Board of each NIH Institute or Center. Councils meet three times annually and review summary statements and funding recommendations produced by Tier 1. The council can approve, defer, or — in rare circumstances — recommend against funding an application the study section scored favorably. Council approval is required before any award can be issued (NIH Advisory Councils).

Causal relationships or drivers

The structure of peer review reflects deliberate design choices shaped by competing pressures. The percentile rank system emerged because raw impact scores vary systematically across study sections — a score of 20 from one panel may represent a different level of merit than a 20 from another. Percentilization normalizes across these panel-level scoring cultures.
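
The effect of panel-level scoring cultures can be illustrated with two made-up panels. The sketch below uses a deliberately simplified measure, the share of a panel's applications with a better score, as a stand-in for percentile rank; both score lists and the function name are hypothetical.

```python
def share_scoring_better(score, panel_scores):
    """Percent of a panel's scored applications with a better (lower) score."""
    better = sum(1 for s in panel_scores if s < score)
    return 100 * better / len(panel_scores)

# Two hypothetical panels: one scores generously, one scores severely.
lenient_panel = [10, 12, 15, 18, 20, 22, 25, 30, 35, 40]
strict_panel = [20, 28, 30, 35, 38, 40, 45, 50, 55, 60]

print(share_scoring_better(20, lenient_panel))  # 40.0
print(share_scoring_better(20, strict_panel))   # 0.0
```

An impact score of 20 sits mid-pack on the lenient panel but is the best score on the strict one, which is the discrepancy percentilization is meant to correct.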

Triage exists as a workload management mechanism. When NIH application volumes increased significantly following the budget doubling between fiscal years 1998 and 2003 (NIH Budget History, Office of Budget), study sections faced panel meetings that would have been unmanageable without a mechanism to exclude clearly non-competitive applications from full discussion.

The five-criterion framework was last substantially revised in 2009 as part of the Enhancing Peer Review initiative, which also introduced the current 9-point scoring scale (NIH Enhancing Peer Review). That revision replaced a 5-point scale that had produced score compression at the high end — a structural problem where too many applications clustered at 1.0 to 1.5, reducing the system's ability to rank-order competing proposals.

Classification boundaries

Not all applications follow the standard SRG pathway. The following boundaries define when alternative review processes apply:

Special Emphasis Panels (SEPs): Convened for applications with unusual conflicts of interest, highly specialized topics without an appropriate standing study section, or resubmissions after prior review irregularities. SEPs use the same criteria and scoring methodology.
Expedited review: Some small grant mechanisms (R03) and exploratory grants (R21) receive expedited review with less preliminary data required, but the same five-criterion scoring.
Fellowship and training grant review: F31, F32, and T32 applications are scored on differently weighted criteria, with greater emphasis on the training environment and sponsor qualifications than on the research project itself.
SBIR/STTR applications: Small Business Innovation Research and Small Business Technology Transfer applications are reviewed by specialized study sections under CSR but evaluated against additional commercialization criteria. The NIH small business grants process covers this pathway in detail.
Administrative supplements: Supplements to existing awards typically bypass full peer review and are evaluated administratively by program officers within the funding institute.

Tradeoffs and tensions

The peer review system generates documented tensions that have been the subject of repeated NIH working group analyses.

Conservatism versus innovation: The Approach criterion rewards methodological rigor and feasibility, which tends to favor incremental science over high-risk proposals. Studies of grant review outcomes — including analyses published in peer-reviewed journals citing NIH-funded bibliometric data — have found that highly novel proposals frequently receive lower Approach scores, creating structural pressure against paradigm-shifting research.

Reviewer expertise versus breadth: CSR study sections are organized around discipline clusters, but interdisciplinary applications frequently fall between established panel expertise areas. Misassignment to a panel lacking relevant expertise is a documented driver of poor scores unrelated to scientific merit.

Transparency versus confidentiality: Reviewer identities are not disclosed to applicants, protecting reviewers from professional retaliation. This confidentiality, however, limits applicants' ability to identify and formally contest potential conflicts of interest beyond the standard conflict-of-interest disclosure process.

Percentile rank versus absolute merit: Percentilization stabilizes comparisons across panels, but because each application is ranked only against its own study section's pool, an application that misses a 10th-percentile payline after review in a highly competitive study section might have ranked comfortably within it in a less competitive pool. The NIH budget and federal funding allocation page describes how paylines are set at the institute level.

Resubmission policy: NIH limits applicants to one resubmission (A1) after an initial unfunded application (A0). Applications not funded after the A1 submission were originally required to be substantially redesigned before returning as new (A0) applications; since 2014, NIH has accepted such applications as new A0 submissions without requiring a substantial redesign. The one-resubmission limit, revised in 2011, was intended to reduce review burden but has been contested by investigators who argue it disadvantages projects requiring iterative reviewer feedback.

Common misconceptions

Misconception: A strong (numerically low) impact score guarantees funding.
Correction: Funding decisions rest with the institute, not the study section. Program officers and institute advisory councils weigh strategic priorities, portfolio balance, and payline thresholds. An application scoring at the 12th percentile may not be funded if the institute's payline is the 10th percentile and the institute has no mechanism to fund beyond the line without special justification.

Misconception: Triaged applications cannot be resubmitted.
Correction: Triage (previously called "unscored" or "streamlined") is a discussion decision, not a rejection. Applicants receive written critiques from the three assigned reviewers and may revise and resubmit. The triage outcome does not appear on the funded/not-funded record used by program officers.

Misconception: Reviewers are NIH employees.
Correction: Study section members are extramural scientists — faculty and researchers at universities, hospitals, and research institutions — who serve as Special Government Employees (SGEs). NIH Scientific Review Officers (SROs), who are federal employees, manage panel logistics and ensure procedural compliance but do not score applications.

Misconception: The same percentile rank means the same thing across institutes.
Correction: Each NIH Institute and Center sets its own payline and funding strategy. A 12th-percentile application may fund at one institute and not fund at another, depending on that institute's appropriated budget, portfolio priorities, and the volume of applications it receives.
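
The point about institute-specific paylines can be made concrete with a small comparison. The institute names and payline values below are invented for illustration, and the sketch assumes an application is within the payline when its percentile is at or below it.

```python
def within_payline(percentile, payline):
    """True when the percentile is at or better than (<=) the payline."""
    return percentile <= payline

# Hypothetical paylines; real values vary by institute and fiscal year.
hypothetical_paylines = {"Institute A": 14, "Institute B": 10}

application_percentile = 12
outcomes = {
    name: within_payline(application_percentile, payline)
    for name, payline in hypothetical_paylines.items()
}
print(outcomes)  # {'Institute A': True, 'Institute B': False}
```

Falling within a payline is only one input: as described above, program staff and councils also weigh portfolio priorities and budget, so this check is not a funding decision in itself.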

Checklist or steps (non-advisory)

The following sequence describes the peer review lifecycle as it operates from submission to award decision:

Submission: The application is filed through the NIH grant application process.
Receipt and referral: CSR receives the application and routes it to an appropriate Scientific Review Group (study section).
Reviewer assignment: Three reviewers are assigned to the application: a primary, a secondary, and a tertiary (reader).
Preliminary scoring: The assigned reviewers independently score the five review criteria on the 9-point scale and submit written critiques.
Triage: Applications in the bottom half by average preliminary score are not discussed; applicants receive written critiques but no impact score.
Panel discussion: Remaining applications are discussed at the full panel meeting, and all eligible reviewers cast independent scores.
Impact score and percentile: The mean of the panel's scores is multiplied by 10 to give the impact score, which is then ranked against the same SRG's score distribution to give the percentile rank.
Council review: The National Advisory Council or Board of the relevant Institute or Center reviews summary statements and funding recommendations.
Funding decision: The institute weighs percentile rank against its payline, strategic priorities, and portfolio balance before any award is issued.

Reference table or matrix

NIH Review Criteria: Scoring Dimensions Compared by Application Type

Criterion	R01 / R21 / R03	F-Series Fellowship	T-Series Training	SBIR / STTR
Significance	Scored (1–9)	Scored	Scored	Scored
Investigators	Scored (1–9)	Heavily weighted (applicant + sponsor)	Weighted (program directors)	Scored (team + commercialization)
Innovation	Scored (1–9)	Scored	Less emphasized	Scored
Approach	Scored (1–9)	Scored (training plan emphasis)	Scored (training program)	Scored + commercialization potential
Environment	Scored (1–9)	Heavily weighted (training environment)	Heavily weighted	Scored (facilities + resources)
Commercialization	Not applicable	Not applicable	Not applicable	Required criterion
Overall Impact Score	Yes (10–90 scale)	Yes	Yes	Yes
Percentile Rank	Yes	Limited (smaller pools)	Limited	Yes

Sources: NIH Peer Review Criteria for Research Grants; NIH Fellowship Review Criteria; NIH SBIR/STTR Review
