Population Sampling & Representation

A comprehensive collection of diverse and highly accurate human reference genomes is critical to a complete understanding of human genetic variation. Samples in this project will be selected to improve the representation of human genetic diversity, yielding a richer human genomic reference resource. Such a resource promises to improve our understanding of genomics and our ability to predict, diagnose, and treat disease. The HPRC will select at least 350 cell lines from individuals who offer ancestral genetic diversity and who have consented to unrestricted-access data release.

Initial Sampling Effort

The first 200 samples included in the project will be from the 1000 Genomes (1KG) collection at Coriell. Selection of the 1KG lines will be prioritized based on the samples’ ability to cover genetic and geographic diversity and on the availability of low-passage cell lines, which avoids artifacts that may arise during extensive cell culture. A subset will also be prioritized for the availability of trio/parental data, which helps with the technical challenges of phased assembly.

Future Sampling

We acknowledge that references generated from 1000 Genomes samples alone are insufficient to capture the extent of sequence diversity in the human population. To survey as much of that diversity as possible, we will broaden our efforts to recruit new participants into this study and establish a consent process that supports open-access data release and cell-based resources.