Chapter 26: Variant Interpretation and ACMG Guidelines
Johnson’s First Principle: The Clinical Standard
A p-value does not treat a patient. In clinical genomics, variant classification must be quantified, repeatable, and legally defensible. When a report enters a patient’s medical record, it becomes a legal document. Calling a benign variant pathogenic leads to unnecessary surgery; missing a pathogenic variant leads to missed treatment.
Core Concepts
The ACMG/AMP Five-Tier System
The American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) published the framework for variant interpretation in 2015 (Richards et al., Genetics in Medicine):
| Tier | Category | Clinical Action |
|---|---|---|
| 5 | Pathogenic | Confirmatory testing, family screening |
| 4 | Likely Pathogenic | Medical management (with caution) |
| 3 | Variant of Uncertain Significance (VUS) | No clinical action |
| 2 | Likely Benign | No action |
| 1 | Benign | No action |
Evidence Codes and Scoring
Pathogenic and benign classifications are built from evidence codes:
Pathogenic evidence codes:
- PVS1 (very strong): Null variant (nonsense, frameshift, canonical ±1 or ±2 splice site) in a gene where loss of function is a known disease mechanism. Critical exception: a premature stop codon in the last exon, or in roughly the last 50 nucleotides of the penultimate exon, typically escapes nonsense-mediated decay (NMD) and produces a truncated protein that may retain partial function. For such variants PVS1 is applied at reduced strength (PVS1_Strong, PVS1_Moderate, or PVS1_Supporting, depending on how much of the protein is lost and whether the removed region is functionally critical). The strongest evidence code cannot be applied at full weight without confirming that NMD is expected to be triggered.
- PS1 (strong): Same amino acid change as a previously established pathogenic variant, regardless of the nucleotide change.
- PM2 (moderate): Absent from population databases (gnomAD), or present only at extremely low frequency for a recessive disorder. PM2 alone is not sufficient for pathogenicity; it contributes only in combination with other codes.
- PP3 (supporting): Multiple in silico prediction tools (SIFT, PolyPhen, CADD, REVEL) predict a damaging effect. Under the 2015 framework PP3 is supporting evidence only.
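The NMD exception for PVS1 can be checked mechanically once the exon structure of the transcript is known. The following is a minimal sketch of the commonly cited 50-nucleotide rule, not a validated clinical tool. The function name, the argument layout (exon lengths of the spliced transcript listed 5' to 3', stop codon position as a 1-based transcript coordinate), and the example transcript are assumptions made for illustration.

```python
def predicts_nmd_escape(exon_lengths, ptc_pos, window=50):
    """Return True if a premature stop codon is predicted to escape NMD.

    exon_lengths: exon lengths (nt) of the spliced transcript, 5' to 3'.
    ptc_pos: 1-based transcript position of the first base of the stop codon.
    window: distance upstream of the last exon-exon junction (50-nt rule).
    """
    if len(exon_lengths) < 2:
        # Single-exon transcript: no exon-exon junction, so the
        # junction-dependent NMD pathway is not expected to act.
        return True
    total = sum(exon_lengths)
    if not 1 <= ptc_pos <= total:
        raise ValueError("ptc_pos lies outside the transcript")
    # Last base of the penultimate exon, i.e. just before the final junction.
    last_junction = total - exon_lengths[-1]
    # Escape if the stop codon is in the last exon, or within `window`
    # nucleotides upstream of the final junction.
    return ptc_pos > last_junction - window


# Hypothetical four-exon transcript; the final junction follows base 650.
exons = [200, 300, 150, 400]
in_last_exon = predicts_nmd_escape(exons, 700)
near_junction = predicts_nmd_escape(exons, 620)
upstream = predicts_nmd_escape(exons, 250)
```

A `True` result is the signal to reduce PVS1 strength rather than apply it at very strong weight. Real transcripts have further exceptions (for example stop codons very close to the start codon), which this sketch ignores.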
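Benign evidence codes:
- BA1 (stand-alone): Allele frequency >5% in a population database. A single BA1 code can classify a variant as Benign.
- BS1 (strong): Allele frequency greater than expected for the disorder. The cutoff depends on prevalence, penetrance, and inheritance; figures such as >1% for recessive and >0.5% for dominant conditions are rules of thumb, not values fixed by the guideline.
- BP4 (supporting): In silico predictions suggest a benign effect.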
The scoring matrix combines codes to determine the final classification:
| Classification | Rule (minimum evidence combination) |
|---|---|
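| Pathogenic | 1 PVS1 + ≥1 PS, or 1 PVS1 + ≥2 PM, or 1 PVS1 + 1 PM + 1 PP, or 1 PVS1 + ≥2 PP; or ≥2 PS; or 1 PS + ≥3 PM, or 1 PS + 2 PM + ≥2 PP, or 1 PS + 1 PM + ≥4 PP |
| Likely Pathogenic | 1 PVS1 + 1 PM, or 1 PS + 1–2 PM, or 1 PS + ≥2 PP, or ≥3 PM, or 2 PM + ≥2 PP, or 1 PM + ≥4 PP |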
| VUS | Does not meet criteria for any other tier |
| Likely Benign | 1 BS + 1 BP, or ≥2 BP |
| Benign | 1 BA1 (stand-alone), or ≥2 BS |
The matrix intentionally makes Pathogenic harder to reach than Benign: a single BA1 code (population frequency >5% in any ancestral population) immediately classifies a variant as Benign, while Pathogenic always requires multiple independent lines of evidence.
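The combining logic is easier to audit when written as code. The sketch below follows the combining rules as published in Richards et al. (2015). It is illustrative only: the function names are hypothetical, code strength is read from the prefix alone (so strength modifiers such as PVS1_Strong are not handled), BA1 is treated as stand-alone as described above, and conflicting pathogenic and benign evidence falls back to VUS.

```python
import re
from collections import Counter


def tally(codes):
    """Count evidence codes by strength prefix (PVS, PS, PM, PP, BA, BS, BP)."""
    counts = Counter()
    for code in codes:
        m = re.fullmatch(r"(PVS|PS|PM|PP|BA|BS|BP)\d+", code)
        if m is None:
            raise ValueError(f"unrecognized evidence code: {code}")
        counts[m.group(1)] += 1
    return counts


def classify(codes):
    """Apply the 2015 ACMG/AMP combining rules to a list of evidence codes."""
    c = tally(codes)
    pvs, ps, pm, pp = c["PVS"], c["PS"], c["PM"], c["PP"]
    ba, bs, bp = c["BA"], c["BS"], c["BP"]

    if ba >= 1:  # stand-alone benign evidence
        return "Benign"

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    path_side = "Pathogenic" if pathogenic else ("Likely Pathogenic" if likely_pathogenic else None)
    benign_side = "Benign" if benign else ("Likely Benign" if likely_benign else None)

    if path_side and benign_side:
        return "VUS"  # contradictory evidence
    return path_side or benign_side or "VUS"


null_variant = classify(["PVS1", "PM2"])   # very strong + one moderate
rare_missense = classify(["PM2", "PP3"])   # too little evidence either way
```

Note that rarity plus a damaging in silico prediction (PM2 + PP3) stays at VUS, which is how many rare missense variants end up in that tier.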
gnomAD: Population Frequency Database
gnomAD v2.1 (Karczewski et al., 2020) aggregates exome and genome data from 141,456 individuals. Key statistics per variant are allele count, allele number, allele frequency (AF), and homozygote count, reported overall and per ancestry group. At the gene level, gnomAD provides constraint metrics that compare observed with expected variation, such as the loss-of-function observed/expected ratio. A gene under strong constraint (observed/expected ratio well below 1) is depleted of damaging variation in the population, so disrupting it is more likely to cause disease.
Critical caveat: gnomAD's ancestry composition is predominantly European (more than half of the v2.1 samples). Variant frequencies in underrepresented populations (African, Latino, Indigenous) are less well characterized, so frequency-based benign evidence is harder to obtain and VUS calls are more common in these populations.
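Because ancestry composition is uneven, a pooled allele frequency can hide a variant that is common in one group. The sketch below shows one way to apply the frequency codes per ancestry group instead of to the pooled figure. All names, the population labels, the minimum allele number of 2,000, the BS1 placeholder cutoff, and the counts in the example are assumptions for illustration; BS1 in particular must be set per disorder.

```python
def max_population_af(pop_counts, min_an=2000):
    """Return (population, AF) for the group with the highest allele frequency.

    pop_counts: {population_label: (allele_count, allele_number)}.
    Groups with fewer than `min_an` genotyped alleles are skipped because
    their frequency estimates are too noisy (assumed cutoff).
    """
    afs = {
        pop: ac / an
        for pop, (ac, an) in pop_counts.items()
        if an >= min_an
    }
    if not afs:
        return None, None
    top = max(afs, key=afs.get)
    return top, afs[top]


def frequency_code(pop_counts, ba1=0.05, bs1=0.01):
    """Suggest BA1/BS1 from the highest per-population AF (bs1 is a placeholder)."""
    _, af = max_population_af(pop_counts)
    if af is None:
        return None
    if af > ba1:
        return "BA1"
    if af > bs1:
        return "BS1"
    return None


# Invented counts: rare in the largest group, common in a smaller one.
counts = {"nfe": (12, 120000), "afr": (1500, 24000), "eas": (0, 500)}
pooled_af = sum(ac for ac, _ in counts.values()) / sum(an for _, an in counts.values())
code = frequency_code(counts)
```

In this invented example the pooled frequency is close to 1%, well under the BA1 cutoff, while the per-population check finds a group above 5%. The converse also holds: a variant that is frequent only in a population missing from the database cannot receive BA1 at all.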
In Silico Prediction Tools
Multiple computational tools predict variant deleteriousness:
- SIFT: Conservation-based; predicts whether an amino acid substitution affects protein function from sequence homology.
- PolyPhen-2: Combines structural and sequence-based features.
- CADD: Integrates annotations into a single C-score by contrasting variants that survived natural selection against simulated variants. The score is Phred-scaled: CADD 10 means the variant is in the top 10% of deleteriousness genome-wide, 20 means top 1%, 30 means top 0.1%. The commonly used threshold of CADD > 20 as "likely deleterious" is a heuristic, not a clinically validated cutoff. Any single cutoff trades sensitivity against specificity, and the score should never be the sole evidence for pathogenicity.
- REVEL: Ensemble predictor combining 13 individual tools; commonly recommended as the default for missense variant interpretation.
All in silico predictions are supporting evidence only (PP3/BP4), never strong. They cannot be used alone to classify a variant.
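The Phred scaling of CADD described above is a simple logarithmic transform of a variant's rank among all scored variants. A minimal sketch of the arithmetic follows; the function names are hypothetical and the code is not part of the CADD software.

```python
import math


def phred_from_rank(rank, total):
    """Phred-scaled score for a variant ranked `rank` (1 = most deleterious)
    out of `total` scored variants."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return -10 * math.log10(rank / total)


def top_fraction(phred):
    """Fraction of variants scoring at least as deleterious as `phred`."""
    return 10 ** (-phred / 10)


# A score of 20 corresponds to the top 1% of variants, 30 to the top 0.1%.
fraction_at_20 = top_fraction(20)
fraction_at_30 = top_fraction(30)
```

Because the scale is a rank transform over all possible substitutions, a score of 20 says where a variant sits relative to the rest of the genome. It does not state a probability that the variant is pathogenic.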
ClinVar and the VUS Crisis
ClinVar is the public database of human variant classifications submitted by clinical laboratories and research groups. Each record has a review status (0-4 stars): 3-star records have been reviewed by an expert panel, and 4-star records reflect a practice guideline.
The VUS crisis: 40-50% of reported variants are classified as VUS, and this rate is 2x higher in underrepresented populations. A Black woman is twice as likely as a White woman to receive a VUS instead of a definitive classification for the same family history. This is a health equity crisis.
Biological Interpretation
The VUS crisis reflects fundamental uncertainty: most missense variants in the human genome have never been observed in a clinical context, and their functional effects cannot be confidently predicted. A VUS is not actionable — clinical action on a VUS is outside the standard of care. Patients who receive VUS results experience anxiety and may pursue unnecessary interventions.
The asymmetry of the ACMG system is intentional but creates a bias: Benign can be reached with a single BA1 code, whereas Pathogenic requires several codes. Frequency-based benign evidence is only available for variants common enough to be observed, so extremely rare variants that are in fact benign have no BA1 or BS1 route and often remain classified as VUS.
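Phenotype plays a critical role but is often ignored in computational variant interpretation. Co-segregation of a variant with disease across many affected members of a large family is pathogenic evidence (PP1, which can be weighted more strongly as the number of informative meioses grows), and it can be decisive for rare variants that population data cannot resolve. Segregation does not override a population frequency that is too high for the disorder, and in practice segregation data are often missing from clinical records.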
Current Landscape (Q2 2026)
- AMP 2026 draft guidelines (final expected Q1 2026) add explicit liquid biopsy variant classification, a new E-level evidence tier for tumorigenic classification, and mandatory incidental germline findings review when analyzing tumor-only liquid biopsy data.
- ACMG/AMP V4 (points-based scoring system replacing static rules) is in development, with REVEL as the recommended default in silico predictor for missense variants.
- ClinGen gene-specific guidelines (PALB2, BRCA1/2, PTEN, TP53) continue to refine evidence code application rules per gene, adding domain-specific guidance.
- Functional assay data (saturation mutagenesis, MAVE) is increasingly accepted as strong evidence (PS3/BS3), accelerating VUS resolution by providing empirical measurements of variant function at scale.
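To make the idea of points-based scoring concrete, the sketch below uses the point values and thresholds from the published Bayesian adaptation of the 2015 rules (Tavtigian et al., 2020): supporting 1, moderate 2, strong 4, very strong 8, with benign evidence counted negatively. Whether V4 adopts these exact values is an assumption here, and the function names are hypothetical. BA1 is kept as a stand-alone rule outside the point sum.

```python
import re

# Points per evidence strength; benign evidence subtracts.
POINTS = {"PVS": 8, "PS": 4, "PM": 2, "PP": 1, "BS": -4, "BP": -1}


def total_points(codes):
    """Sum the points for a list of evidence codes such as ["PVS1", "PM2"]."""
    total = 0
    for code in codes:
        m = re.fullmatch(r"(PVS|PS|PM|PP|BS|BP)\d+", code)
        if m is None:
            raise ValueError(f"unrecognized evidence code: {code}")
        total += POINTS[m.group(1)]
    return total


def classify_points(codes):
    """Map a point total to a five-tier classification."""
    if "BA1" in codes:  # stand-alone, handled before any arithmetic
        return "Benign"
    total = total_points(codes)
    if total >= 10:
        return "Pathogenic"
    if total >= 6:
        return "Likely Pathogenic"
    if total >= 0:
        return "VUS"
    if total >= -6:
        return "Likely Benign"
    return "Benign"


two_strong = classify_points(["PS1", "PS3"])          # 4 + 4 = 8 points
mixed = classify_points(["PS3", "PM2", "BP4"])        # 4 + 2 - 1 = 5 points
```

One practical advantage of a point sum is that conflicting evidence is netted out arithmetically instead of forcing an automatic VUS. In this scheme two strong codes sum to 8 points and land in Likely Pathogenic.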
Summary and Required Reading
- ACMG/AMP 5-tier system (Benign to Pathogenic) with evidence codes (PVS1, PS1, PM2, PP3, BA1) — codes combine via the scoring matrix.
- Population frequency (gnomAD) is the first filter — >5% AF = benign. Ancestry matching is critical.
- In silico predictions are supporting only (PP3) — never strong enough alone for clinical classification.
- The VUS crisis is a health equity crisis — 40-50% of variants are VUS, with 2x higher rates in underrepresented populations.
Required Reading
- Richards et al.: “Standards and guidelines for the interpretation of sequence variants” (Genetics in Medicine, 2015).
- Karczewski et al.: “The mutational constraint spectrum quantified from variation in 141,456 humans” (Nature, 2020).
Johnson’s Rule: If you would not change your own family’s treatment based on this variant, do not report it as pathogenic.