Department of Genetics, Harvard Medical School, Boston, MA 02115, USA.
Proc Natl Acad Sci U S A. 2012 Jul 24;109(30):11920-7. doi: 10.1073/pnas.1201904109. Epub 2012 Jul 13.
Rapid advances in DNA sequencing promise to enable new diagnostics and individualized therapies. Achieving personalized medicine, however, will require extensive research on highly reidentifiable, integrated datasets of genomic and health information. To assist with this, participants in the Personal Genome Project choose to forgo privacy via our institutional review board-approved "open consent" process. The contribution of public data and samples facilitates both scientific discovery and standardization of methods. We present our findings after enrollment of more than 1,800 participants, including whole-genome sequencing of 10 pilot participant genomes (the PGP-10). We introduce the Genome-Environment-Trait Evidence (GET-Evidence) system. This tool automatically processes genomes and prioritizes both published and novel variants for interpretation. In the process of reviewing the presumed healthy PGP-10 genomes, we find numerous literature references implying serious disease. Although it is sometimes impossible to rule out a late-onset effect, stringent evidence requirements can address the high rate of incidental findings. To that end, we develop a peer production system for recording and organizing variant evaluations according to standard evidence guidelines, creating a public forum for reaching consensus on interpretation of clinically relevant variants. Genome analysis becomes a two-step process: using a prioritized list to record variant evaluations, then automatically sorting reviewed variants using these annotations. Genome data, health and trait information, participant samples, and variant interpretations are all shared in the public domain; we invite others to review our results using our participant samples and contribute to our interpretations. We offer our public resource and methods to further personalized medical research.
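The two-step analysis described in the abstract can be sketched roughly as follows. This is a minimal Python illustration, not the GET-Evidence implementation: the `Variant` fields, the numeric scales, the `min_evidence` threshold, and all function names are hypothetical, chosen only to show how a prioritized review queue and a later automatic sort on recorded annotations might fit together.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    # All fields are hypothetical; the abstract does not define a schema.
    name: str                       # placeholder label for a variant
    priority: int                   # automatic priority (higher = review sooner)
    evidence: Optional[int] = None  # reviewer's evidence score; None = unreviewed
    impact: Optional[int] = None    # reviewer's clinical-impact score


def review_queue(variants: List[Variant]) -> List[Variant]:
    """Step 1: list unreviewed variants, highest automatic priority first."""
    pending = [v for v in variants if v.evidence is None]
    return sorted(pending, key=lambda v: v.priority, reverse=True)


def record_evaluation(variant: Variant, evidence: int, impact: int) -> None:
    """Store a reviewer's annotations on the variant."""
    variant.evidence = evidence
    variant.impact = impact


def sort_reviewed(variants: List[Variant], min_evidence: int = 3) -> List[Variant]:
    """Step 2: keep reviewed variants that meet an (assumed) evidence
    threshold, then sort by impact, breaking ties by evidence."""
    kept = [
        v for v in variants
        if v.evidence is not None and v.evidence >= min_evidence
    ]
    return sorted(kept, key=lambda v: (v.impact or 0, v.evidence), reverse=True)


if __name__ == "__main__":
    genome = [
        Variant("var-a", priority=2),
        Variant("var-b", priority=5),
        Variant("var-c", priority=1, evidence=4, impact=1),
    ]
    queue = review_queue(genome)
    record_evaluation(queue[0], evidence=5, impact=3)
    record_evaluation(queue[1], evidence=1, impact=4)  # weak evidence
    for v in sort_reviewed(genome):
        print(v.name, v.impact, v.evidence)
```

The evidence threshold in `sort_reviewed` is one possible way to model the "stringent evidence requirements" mentioned above: a variant with a high claimed impact but weak supporting evidence drops out of the final report instead of surfacing as an incidental finding.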