MyHeritage, Or Yehuda 6037606, Israel.
Department of Computer Science, Fu Foundation School of Engineering, Columbia University, New York, NY, USA.
Science. 2018 Nov 9;362(6415):690-694. doi: 10.1126/science.aau4832. Epub 2018 Oct 11.
Consumer genomics databases have reached the scale of millions of individuals. Recently, law enforcement authorities have exploited some of these databases to identify suspects via distant familial relatives. Using genomic data of 1.28 million individuals tested with consumer genomics, we investigated the power of this technique. We project that about 60% of the searches for individuals of European descent will result in a third-cousin or closer match, which theoretically allows their identification using demographic identifiers. Moreover, the technique could implicate nearly any U.S. individual of European descent in the near future. We demonstrate that the technique can also identify research participants of a public sequencing project. On the basis of these results, we propose a potential mitigation strategy and policy implications for human subject research.