Chapter 29: Cancer Genomics and Somatic Evolution

Published

June 5, 2026

Modified

June 19, 2026

Johnson’s First Principle: Cancer is Darwinian Micro-Evolution

Cancer is not a static disease. It is an accelerated Darwinian ecosystem compressed into a single human lifetime. A tumor is not a single population of identical cells; it is a collection of competing clones, each with different mutations, each vying for resources. Chemotherapy is not just treatment; it is the most powerful evolutionary selection pressure the tumor has encountered.

Core Concepts

Germline vs. Somatic Calling

Germline variant callers expect variant allele frequencies (VAF) of ~50% (heterozygous) or 100% (homozygous). Somatic callers must detect mutations at any VAF from 1-50%, requiring:

A matched normal sample (blood or adjacent normal tissue) to filter germline polymorphisms
A different prior that accommodates low VAFs
Cross-sample contamination modeling

A matched normal is not optional; population databases (gnomAD) cannot substitute, because they do not contain the rare, private germline variants that are the most likely to be mistaken for somatic mutations.
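The role of the matched normal can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how production somatic callers work: they use statistical models rather than fixed cutoffs. The `SiteCounts` container, the `classify_site` function, and every threshold below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Read counts at one candidate site (hypothetical container)."""
    tumor_alt: int
    tumor_depth: int
    normal_alt: int
    normal_depth: int


def classify_site(site, min_tumor_vaf=0.01, max_normal_vaf=0.02,
                  min_normal_depth=20):
    """Label a site 'somatic', 'germline', or 'uncertain'.

    All cutoffs are illustrative assumptions, not recommended values.
    """
    # Without enough normal coverage a germline variant cannot be ruled out.
    if site.tumor_depth == 0 or site.normal_depth < min_normal_depth:
        return "uncertain"
    tumor_vaf = site.tumor_alt / site.tumor_depth
    normal_vaf = site.normal_alt / site.normal_depth
    if tumor_vaf < min_tumor_vaf:
        return "uncertain"
    # Clear support in the normal sample means the variant is inherited.
    if normal_vaf > max_normal_vaf:
        return "germline"
    return "somatic"


low_vaf_somatic = classify_site(SiteCounts(12, 200, 0, 150))
private_germline = classify_site(SiteCounts(95, 200, 70, 150))
thin_normal = classify_site(SiteCounts(12, 200, 0, 5))
```

The second site has a tumor VAF near 50% and would pass any tumor-only filter if the variant were absent from population databases; only the normal sample reveals it as germline.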

Clonal Evolution

The Nowell (1976) clonal evolution model: a single founding cell acquires a driver mutation, proliferates, and accumulates additional mutations. Subclones emerge through branching evolution:

Founding clone (APC mutation)
  ├── Subclone A (KRAS mutation) → Primary tumor dominant
  │     └── Subclone A1 (TP53 mutation) → Metastasis
  └── Subclone B (PIK3CA mutation) → Resistant to therapy

VAF-based subclone detection: Each mutation’s VAF reflects the proportion of tumor cells carrying it, but the relationship depends on tumor purity \(p\), the fraction of sequenced DNA derived from cancer cells (vs. infiltrating normal cells or stroma). For a heterozygous mutation in a diploid region, only one of the two copies in each mutated cell carries the variant, so the expected VAF is \(p \cdot CCF / 2\) and the cancer cell fraction (CCF) is:

\[CCF = \frac{2 \cdot \text{VAF}_{\text{obs}}}{p}\]

In a pure tumor (\(p = 1.0\)), a clonal heterozygous mutation has expected VAF = 0.5. At \(p = 0.25\) (common in low-cellularity biopsies), the same clonal mutation has expected VAF ≈ 0.125, diluted by normal cells far below the ~50% a germline caller expects and easy to miss at modest sequencing depth. Tumor purity is estimated from sequencing data by tools (ASCAT, PureCN, ABSOLUTE) that jointly model allele frequencies and copy number across heterozygous SNPs. After purity correction, VAFs are transformed to CCFs: subclonal mutations have CCF < 1, while clonal mutations have CCF ≈ 1 (present in every cancer cell). Tools like PyClone and SciClone cluster mutations by CCF to infer subclonal architecture.
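The worked numbers above (VAF 0.5 at full purity, 0.125 at 25% purity) follow from expected VAF = p × CCF / 2 for a heterozygous mutation in a diploid region. A minimal sketch of that conversion, assuming no copy-number change at the locus; the function names are illustrative and not taken from any of the tools mentioned:

```python
def expected_vaf(ccf, purity):
    """Expected VAF of a heterozygous mutation in a diploid region.

    Each mutated cancer cell contributes one variant copy out of two,
    and cancer cells make up `purity` of the sequenced DNA.
    """
    return purity * ccf / 2.0


def ccf_from_vaf(vaf, purity):
    """Invert expected_vaf; capped at 1.0 because sampling noise can overshoot."""
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    return min(1.0, 2.0 * vaf / purity)


clonal_pure = expected_vaf(ccf=1.0, purity=1.0)      # 0.5
clonal_diluted = expected_vaf(ccf=1.0, purity=0.25)  # 0.125
recovered_ccf = ccf_from_vaf(vaf=0.125, purity=0.25)  # 1.0, i.e. clonal
subclonal_ccf = ccf_from_vaf(vaf=0.05, purity=0.5)    # about 0.2
```

Regions with copy-number gains or losses need the fuller model used by purity and clustering tools, which also accounts for local copy number and the number of mutated copies.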

Liquid Biopsy and ctDNA

Circulating tumor DNA (ctDNA) is fragmented DNA (~167 bp, apoptotic origin) released into the bloodstream by tumor cells:

MRD (molecular residual disease) monitoring: Detect ctDNA after surgery to identify patients with residual disease before radiographic evidence appears.
Resistance detection: Rising VAF of a resistance mutation (e.g., EGFR T790M) precedes radiographic progression by 3-6 months.
Sampling bias solution: A single biopsy captures only the dominant clone from one tumor region. ctDNA captures the aggregate mutational landscape of all metastatic sites simultaneously.

Critical caveat: clonal hematopoiesis of indeterminate potential (CHIP). Mutations in hematopoietic stem cells (most commonly DNMT3A, TET2, ASXL1) accumulate with age, and DNA from the resulting blood-cell clones enters the plasma at levels indistinguishable from tumor ctDNA. These CHIP mutations produce false-positive ctDNA results in 10-20% of liquid biopsy tests, particularly in patients over 60. Distinguishing CHIP from tumor-derived ctDNA requires paired white blood cell sequencing; if the mutation is present in the buffy coat fraction, it is CHIP, not tumor-derived. Laboratories that omit this correction report systematically inflated ctDNA detection rates and more false-positive calls in MRD monitoring.
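The paired white blood cell check can be sketched as a simple partition of plasma calls. This is a minimal illustration, assuming the buffy coat was sequenced deeply enough that a missing call really means absence; the function name, the 0.5% cutoff, and the example VAFs are hypothetical.

```python
def split_chip_from_tumor(plasma_calls, wbc_vaf, min_wbc_vaf=0.005):
    """Partition plasma variants into (tumor_like, chip_like).

    plasma_calls: dict mapping variant label -> VAF in plasma cfDNA
    wbc_vaf:      dict mapping variant label -> VAF in matched white blood cells
    min_wbc_vaf:  illustrative cutoff for calling a variant present in WBCs
    """
    tumor_like, chip_like = {}, {}
    for variant, vaf in plasma_calls.items():
        # A variant also seen in the buffy coat is attributed to hematopoiesis.
        if wbc_vaf.get(variant, 0.0) >= min_wbc_vaf:
            chip_like[variant] = vaf
        else:
            tumor_like[variant] = vaf
    return tumor_like, chip_like


# Made-up example values, for illustration only.
plasma = {"DNMT3A R882H": 0.012, "EGFR T790M": 0.004}
buffy_coat = {"DNMT3A R882H": 0.015}
tumor_like, chip_like = split_chip_from_tumor(plasma, buffy_coat)
```

Here the DNMT3A variant is removed from the ctDNA result because it appears in the white blood cells, while the EGFR variant is retained as tumor-like.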

Tumor Mutational Burden (TMB)

TMB is the number of somatic mutations per megabase of sequenced coding DNA. It is an FDA-approved tissue-agnostic biomarker for pembrolizumab (anti-PD-1) in solid tumors, with TMB ≥ 10 mutations/Mb as the cutoff associated with immunotherapy benefit. The biological rationale: more mutations → more neoantigens → higher probability of T-cell recognition → better immunotherapy response.
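The TMB arithmetic is simple enough to show directly. A minimal sketch, assuming the mutation count and the callable territory come from the same assay; the function names and the 1.2 Mb panel size are hypothetical, and which mutation classes are counted varies between assays.

```python
def tumor_mutational_burden(n_somatic_mutations, callable_bases):
    """Return somatic mutations per megabase of callable coding sequence."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return n_somatic_mutations * 1_000_000 / callable_bases


def is_tmb_high(tmb, cutoff=10.0):
    """Apply the >= 10 mutations/Mb cutoff quoted in the text."""
    return tmb >= cutoff


# 15 somatic mutations over a hypothetical 1.2 Mb panel.
panel_tmb = tumor_mutational_burden(15, 1_200_000)  # 12.5 mutations/Mb
panel_is_high = is_tmb_high(panel_tmb)               # True
```

Because the denominator is the callable territory rather than the whole genome, the same tumor can yield different TMB values on panels of different size and content.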

Microsatellite Instability (MSI)

MSI is a hypermutator phenotype caused by mismatch repair deficiency (MMRd). MSI-high tumors accumulate hundreds of thousands of mutations at microsatellite repeats. MSI-H is an FDA-approved tissue-agnostic biomarker for immune checkpoint inhibitors. MSI testing compares repeat lengths between tumor and normal: in the classical PCR assay, instability at ≥2 of the 5 Bethesda markers is called MSI-H, and sequencing-based methods apply the same tumor-versus-normal comparison across many more microsatellite loci.
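The tumor-versus-normal comparison can be sketched for a five-marker panel. This is a minimal illustration: a marker is treated as unstable when the tumor shows a repeat length absent from the normal sample, and the repeat lengths below are made up. The MSI-L and MSS labels for one and zero unstable markers follow the usual Bethesda convention and are not stated in the text above.

```python
def unstable_markers(normal_alleles, tumor_alleles):
    """Return markers where the tumor has a repeat length not seen in normal."""
    return sorted(
        marker for marker in normal_alleles
        if set(tumor_alleles.get(marker, ())) - set(normal_alleles[marker])
    )


def msi_status(n_unstable, min_unstable_for_high=2):
    """Classify a five-marker panel: >= 2 unstable markers is MSI-H."""
    if n_unstable >= min_unstable_for_high:
        return "MSI-H"
    return "MSI-L" if n_unstable > 0 else "MSS"


# Hypothetical repeat lengths (in repeat units) per marker.
normal = {"BAT25": [25], "BAT26": [26], "D2S123": [18, 21],
          "D5S346": [12], "D17S250": [15, 17]}
tumor = {"BAT25": [25, 21], "BAT26": [26, 22], "D2S123": [18, 21],
         "D5S346": [12], "D17S250": [15, 17, 13]}

shifted = unstable_markers(normal, tumor)
status = msi_status(len(shifted))
```

In this example three of five markers carry a novel tumor allele, so the panel is called MSI-H.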

Mutational Signatures

Every mutational process leaves a characteristic imprint on the genome. The trinucleotide context of each mutation (the bases immediately 5’ and 3’ of the mutated position) reveals the underlying process. COSMIC catalogs ~80 mutational signatures across human cancer:

Signature	Associated Process	Mutation Pattern
SBS4	Tobacco smoke (lung cancer)	C>A transversions
SBS3	BRCA1/2 deficiency (HRD)	Flat, relatively featureless substitution profile; accompanied by microhomology-flanked deletions and rearrangements
SBS11	Temozolomide treatment	C>T transitions (alkylating-agent pattern)
SBS1	Spontaneous deamination (aging)	C>T at CpG, universal across cancers

Mutational signature analysis is a deconvolution problem: each tumor’s mutational profile is a mixture of the signatures active in that tumor, and the activity of each signature is estimated by fitting the observed trinucleotide mutation counts to the known signature matrix. A dominant SBS4 signature in a reported non-smoker is an unexpected finding that warrants clinical follow-up, such as reviewing exposure history; dominant SBS3 in a triple-negative breast cancer patient suggests homologous recombination deficiency and possible PARP inhibitor eligibility even when no BRCA1/2 mutation is detected.
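The deconvolution described above can be sketched with a toy example. The block below is a minimal illustration, not a production method: it collapses the 96 trinucleotide channels to six substitution classes, uses two made-up signatures (`sig_A` and `sig_B`, which are not COSMIC profiles), and fits exposures with a multiplicative-update form of non-negative least squares. Real analyses use the full signature matrix and dedicated fitting tools.

```python
def fit_exposures(counts, signatures, n_iter=500):
    """Estimate non-negative signature exposures for one tumor.

    counts:     observed mutation counts per channel
    signatures: dict name -> channel probabilities (each list sums to 1)
    Returns dict name -> number of mutations attributed to that signature.
    Uses multiplicative updates, which keep every exposure non-negative.
    """
    names = list(signatures)
    channels = range(len(counts))
    total = float(sum(counts))
    exposures = {name: total / len(names) for name in names}  # flat start
    for _ in range(n_iter):
        # Reconstructed profile under the current exposures.
        fitted = [sum(exposures[n] * signatures[n][i] for n in names)
                  for i in channels]
        for name in names:
            num = sum(signatures[name][i] * counts[i] for i in channels)
            den = sum(signatures[name][i] * fitted[i] for i in channels)
            if den > 0:
                exposures[name] *= num / den
    return exposures


# Channels: C>A, C>G, C>T, T>A, T>C, T>G (toy six-class profiles).
toy_signatures = {
    "sig_A": [0.60, 0.05, 0.15, 0.05, 0.10, 0.05],  # hypothetical, C>A heavy
    "sig_B": [0.05, 0.05, 0.70, 0.05, 0.10, 0.05],  # hypothetical, C>T heavy
}
# Counts built as 40 mutations from sig_A plus 60 from sig_B.
observed = [27, 5, 48, 5, 10, 5]
exposures = fit_exposures(observed, toy_signatures)
```

On this noise-free input the fit recovers exposures close to 40 and 60; with real data the fit is approximate, and signatures with similar profiles can trade off against each other.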

Biological Interpretation

A single biopsy of a metastatic tumor is a single frame of a movie. One biopsy captures only the dominant clone; the resistance-causing subclone may be in a different region of the same tumor or a different metastatic site. Multi-region sequencing studies (TRACERx, PEACE) reveal that up to 50% of driver mutations in metastatic sites are undetectable in the primary tumor biopsy.

ctDNA overcomes this sampling bias: liquid biopsy captures the aggregate mutational landscape from all metastatic sites simultaneously. Rising ctDNA VAF of a resistance mutation precedes radiographic progression by 3-6 months, enabling earlier therapy switching.

Subclonal architecture is the difference between therapeutic success and failure. A mutation present in 100% of tumor cells (clonal) is an ideal therapeutic target: targeting it hits every cancer cell. A mutation present in 5% of cells (subclonal) is a target only for that subclone; the remaining 95% of cells are unaffected and continue to expand under the selective pressure of therapy.

Summary and Required Reading

Somatic calling requires a matched normal: population databases cannot substitute.
Clonal evolution produces heterogeneous subclones; VAF-based detection infers subclonal architecture.
ctDNA overcomes sampling bias: liquid biopsy captures aggregate mutational landscape across all metastatic sites.
TMB and MSI are tissue-agnostic biomarkers for immunotherapy response, based on neoantigen burden.

Required Reading

Alexandrov et al.: “The repertoire of mutational signatures in human cancer” (Nature, 2020).
Wan et al.: “Liquid biopsies come of age” (Nature Reviews Cancer, 2017).

Johnson’s Rule: A single biopsy of a metastatic tumor is a single frame of a movie. The resistance-causing subclone may be in a different region entirely.