HPRC Data Use Protocol and Publication Protocol

Data Use Protocol and Publication Protocol | November, 2022

HPRC’s goals are to develop resources for the human pangenome. These resources will include primary sequence data, high-quality reference genomes, pangenome alignments of these reference genomes, a pangenome tool ecosystem, and reference annotation. The purpose of the HPRC data use and publication policy is to encourage collaboration and coordination among investigators while ensuring the timely release of research and resources to the scientific community. This document describes the process for disclosing planned publications within the HPRC through submission, and provides general guidelines for non-consortium data use and publications.

Data Use

  • HPRC data are made publicly available quickly to enable prepublication sharing. These data include primary sequencing data and derived data such as genome assemblies and annotations. HPRC data are openly available for use, as donors have given broad informed consent for re-use. Importantly, prepublication data may be available before quality control is complete, may be inaccurate, and may be subject to change.
  • Users of the HPRC data will not intentionally identify any participants who contributed biosamples to HPRC.
  • HPRC is committed to FAIR principles. Data are available through NHGRI AnVIL (https://anvilproject.org/) and the AWS public datasets program. Some data are also available through appropriate archives (e.g. GenBank and its INSDC partners).
  • HPRC data freezes are made periodically to reflect high-quality releases of subsets of the data.
  • HPRC data and data freezes are available for analysis by all researchers.
  • We refer to HPRC data as publicly released if they are either published, or part of a data freeze that is unpublished but more than one year old.
  • Researchers are encouraged to publish on publicly released data and do not need to contact the HPRC directly to do so.
  • We request that researchers contact HPRC if publishing work based on HPRC data that have not yet been publicly released. Such contact will allow us to coordinate publication and respect the contributions of those who generated the data. As mentioned above, prepublication data may be available before quality control is complete, may be inaccurate, and may be subject to change. Prior contact with the HPRC will allow us to inform researchers of potential limitations.
  • Although we ask to be contacted about publications on HPRC data that are not yet publicly released, it is generally acceptable to publish on such data with minimal coordination if:
    • The work is restricted to a small subset of the data - for example, an individual chromosome.
    • The work principally demonstrates a methodological development and does not produce genome-wide results or findings.
  • For genome-wide publications on non-publicly released HPRC data, we request that the HPRC Consortium banner be included as an author, following evaluation of the paper by the HPRC Steering Committee.
  • Researchers who use the HPRC data are encouraged to cite the latest integrated HPRC publication and reference the accession information for the resources used in their publications.
  • It is the author’s responsibility to ensure that the use of the data follows the scope of service for each specific sample, as defined by the HPRC data use table found on the HPRC website.
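
As an informal illustration only, the data use rules above can be read as a small decision procedure. The Python sketch below is not an HPRC tool: the names Dataset, is_publicly_released, and coordination_level are hypothetical, and reading "more than one year old" as "the first anniversary of the freeze has passed" is an assumption. In any real case the HPRC itself decides.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Dataset:
    """Hypothetical record for an HPRC data resource (not an HPRC schema)."""
    published: bool                     # already described in a publication
    freeze_date: Optional[date] = None  # date of its data freeze, if any


def is_publicly_released(ds: Dataset, today: date) -> bool:
    """Published, or part of a data freeze that is more than one year old."""
    if ds.published:
        return True
    if ds.freeze_date is None:
        return False
    # Assumption: "more than one year old" means the anniversary has passed.
    try:
        anniversary = ds.freeze_date.replace(year=ds.freeze_date.year + 1)
    except ValueError:
        # Freeze dated 29 February: use 1 March of the following year.
        anniversary = date(ds.freeze_date.year + 1, 3, 1)
    return today > anniversary


def coordination_level(ds: Dataset, today: date, *, small_subset: bool,
                       methods_only: bool, genome_wide: bool) -> str:
    """Map the data use bullets onto one of four outcomes."""
    if is_publicly_released(ds, today):
        return "no contact needed"
    if small_subset or (methods_only and not genome_wide):
        return "minimal coordination"
    if genome_wide:
        return "contact HPRC; consortium banner requested"
    return "contact HPRC"
```

For example, under these assumptions an unpublished freeze dated 1 May 2022 would not count as publicly released on 1 November 2022, so a genome-wide analysis of it would fall into the last-but-one outcome.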

General Guidelines for Publication

  • Any publication must comply with the NIH Public Access policy, and authors should make efforts to publish their work using open-access approaches, acknowledging that sometimes unfettered open access is not the accepted pathway, e.g., for Indigenous datasets and local communities with sovereign rights.
  • Individual investigators funded under the HPRC may separately or collaboratively publish the results of their work.
  • Investigators within HPRC who do similar work are encouraged to work together on publications.
  • An HPRC manuscript tracking sheet will share information about planned manuscripts.
    • Entries in the manuscript tracking sheet are the responsibility of the corresponding author.
    • The information requested in the manuscript tracking sheet will include the following: Authors, Title, Status, Potential Overlap with Other HPRC Publications, Submission Date, Journal, Publication Date, Preprint Submission Link, PMCID.
    • The manuscript tracking sheet will be reviewed monthly at the Steering Committee meeting.
  • Internal Sharing of Planned Manuscripts within the HPRC
    • Investigators shall notify the Steering Committee of any publication using HPRC data or analysis that may impinge on planned integrative data analysis by the consortium. The potential impact of the pending publication can be discussed.
    • Investigators are encouraged to enter information into the tracking system early to enable collaboration and minimize overlap.
    • Information about manuscripts shared internally within the HPRC will be kept confidential and will not be shared outside the consortium.
    • HPRC members are encouraged to share information about publications related to the human reference, even if not directly funded by the HPRC.
  • External Sharing of Submitted Manuscripts
    • When relevant, preprints of submitted manuscripts must be shared via an appropriate preprint server before or concurrently with journal submission.
    • Preprints are considered publications and should be cited like papers.
  • Authorship and Banner Publications
    • The HPRC Steering Committee will work with authors to decide when and whom to credit from the consortium, especially regarding the use of the entire consortium directory. The HPRC banner is especially relevant when authors consider using prepublication data.

Resolution of Disagreements

If questions or disagreements arise between investigators, they should try to resolve them together. If a mutually agreeable solution cannot be reached, the investigators may petition the Steering Committee to help resolve the situation. The decision of the Steering Committee will be final, and all parties agree to abide by it.

Human Pangenome Reference Acknowledgement Statement

We would like to acknowledge the National Human Genome Research Institute (NHGRI) for funding the following grants, which support the creation of the human pangenome reference: 1U41HG010972, 1U01HG010971, 1U01HG010961, 1U01HG010973, 1U01HG010963, and the Human Pangenome Reference Consortium (https://humanpangenome.org/).


This protocol was drafted by the Publication Working Group and requires approval by the HPRC Steering Committee.