Evaluation of MC1R high-throughput nucleotide sequencing data generated by the 1000 Genomes Project.

Author information

Marano Leonardo Arduino, Marcorin Letícia, Castelli Erick da Cruz, Mendes-Junior Celso Teixeira

Affiliations

Departamento de Genética, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, SP, Brazil.

Departamento de Química, Laboratório de Pesquisas Forenses e Genômicas, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, SP, Brazil.

Publication information

Genet Mol Biol. 2017 Apr-Jun;40(2):530-539. doi: 10.1590/1678-4685-GMB-2016-0180. Epub 2017 May 8.

Abstract

The advent of next-generation sequencing allows simultaneous processing of several genomic regions/individuals, increasing the availability and accuracy of whole-genome data. However, these new approaches may present some errors and bias due to alignment, genotype calling, and imputation methods. Despite these flaws, data obtained by next-generation sequencing can be valuable for population and evolutionary studies of specific genes, such as genes related to how pigmentation evolved among populations, one of the main topics in human evolutionary biology. Melanocortin-1 receptor (MC1R) is one of the most studied genes involved in pigmentation variation. As MC1R has already been suggested to affect melanogenesis and increase risk of developing melanoma, it constitutes one of the best models to understand how natural selection acts on pigmentation. Here we employed a locally developed pipeline to obtain genotype and haplotype data for MC1R from the raw sequencing data provided by the 1000 Genomes FTP site. We also compared such genotype data to Phase 3 VCF to evaluate its quality and discover any polymorphic sites that may have been overlooked. In conclusion, either the VCF file or one of the presently described pipelines could be used to obtain reliable and accurate genotype calling from the 1000 Genomes Phase 3 data.

