Release Timeline
May 2023
Release 1: A first draft of the human pangenome reference. This pangenome is composed of 47 phased, diploid genome assemblies (94 haplotypes) from a cohort of diverse individuals selected from the 1000 Genomes Project (1KG) samples. The released assemblies cover >99% of the expected sequence in each genome at >99% accuracy. Accompanying these genome assemblies are detailed pangenomic alignments that map the similarities and differences between the genomes, as well as functional annotations of each assembly, including detailed gene annotations. As an open resource, all sequence data are freely and publicly available without restriction. These data include Pacific Biosciences HiFi long-read sequencing, Oxford Nanopore Technologies (ONT) ultra-long sequencing, Dovetail Hi-C short-read sequencing, Illumina sequencing for each sample and its parents, epigenetic data, and the genome assemblies.
Spring 2025
Release 2: An intermediate release comprising more than 200 samples (over 400 haplotypes). In addition to the more than fourfold increase in genomes, the samples were assembled using more advanced algorithms that integrate the HiFi, ONT, and Hi-C sequencing data more fully, producing significantly more contiguous and more structurally accurate assemblies. An additional base-level polishing step has also been incorporated, reducing single-nucleotide and short insertion and deletion (indel) errors by more than half across the large majority of the assembled sequence. For genome annotation, we also added long-read HiFi transcriptome data from cell lines for most of the samples that went into creating the pangenome. This release is designed to refine the initial human pangenome draft. Accompanying it is an increasingly mature set of analysis tools and pipelines that make use of these pangenome data.
Spring/Summer 2026
Release 3: A stable release comprising more than 350 assemblies (>700 haplotypes). This release will add genomes through new recruitment efforts, including genomes from the BioMe collection in New York City and data generated and/or quality-controlled by International Partners through the International Human Pangenome Project. The genomes in this release are expected to be essentially complete, telomere-to-telomere assemblies in which each haplotype of each chromosome is fully and accurately assembled.