Peng Min-Sheng, He Jun-Dong, Fan Long, Liu Jie, Adeola Adeniyi C, Wu Shi-Fang, Murphy Robert W, Yao Yong-Gang, Zhang Ya-Ping
[1] State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China [2] KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China [3] Youth Innovation Promotion Association, Chinese Academy of Sciences, Beijing, China.
[1] State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China [2] KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China [3] Institute of Clinical and Basic Medical Sciences, First People's Hospital of Yunnan Province (Affiliated Hospital of Kunming University of Science and Technology), Kunming, China.
Eur J Hum Genet. 2014 Aug;22(8):1046-50. doi: 10.1038/ejhg.2013.272. Epub 2013 Nov 27.
Phylogenetically informative Y chromosomal single-nucleotide polymorphisms (Y-SNPs) integrated in DNA chips have not been sufficiently explored in most genome-wide association studies (GWAS). Herein, we introduce a pipeline to retrieve Y-SNP data, together with the software YTool (http://mitotool.org/ytool/), which handles conversion, filtering, and annotation of the data. Genome-wide SNP data from populations in Myanmar are used to construct a haplogroup tree for 117 Y chromosomes based on 369 high-confidence Y-SNPs. Parallel genotyping and published resequencing data of Y chromosomes confirm the validity of our pipeline. We apply this strategy to the CEU HapMap data set and construct a haplogroup tree with 107 Y-SNPs from 39 individuals. The retrieved Y-SNPs can discern the paternal genetic structure of populations. Given the massive quantity of data from GWAS, this method facilitates future investigations of Y chromosome diversity.
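The conversion, filtering, and annotation steps named in the abstract can be illustrated with a minimal sketch. This is not YTool's implementation: the input layout (marker name mapped to two-letter chip calls, with "--" for a missing call), the missing-rate threshold, the marker names, and the haplogroup labels are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Assumed input layout: marker name -> one chip call per male sample.
# Calls are two-letter strings such as "AA"; "--" marks a missing call.
Genotypes = Dict[str, List[str]]
Haploid = Dict[str, List[Optional[str]]]


def to_haploid(call: str) -> Optional[str]:
    """Convert a diploid-style chip call to a single Y allele.

    Missing and heterozygous calls return None, because a male carries
    only one copy of the male-specific part of the Y chromosome.
    """
    if len(call) != 2 or call[0] != call[1] or call[0] not in "ACGT":
        return None
    return call[0]


def filter_y_snps(genotypes: Genotypes, max_missing: float = 0.05) -> Haploid:
    """Keep markers with few unusable calls and exactly two observed alleles."""
    kept: Haploid = {}
    for marker, calls in genotypes.items():
        alleles = [to_haploid(c) for c in calls]
        missing = sum(a is None for a in alleles) / len(alleles)
        observed = {a for a in alleles if a is not None}
        if missing <= max_missing and len(observed) == 2:
            kept[marker] = alleles
    return kept


def annotate(kept: Haploid,
             derived_table: Dict[str, Tuple[str, str]],
             sample_index: int) -> List[str]:
    """List haplogroup labels whose derived allele the sample carries.

    derived_table is a hypothetical lookup: marker -> (derived allele, label).
    """
    return sorted(
        label
        for marker, (derived, label) in derived_table.items()
        if marker in kept and kept[marker][sample_index] == derived
    )


# Illustrative data only; marker names and labels are made up.
example_calls: Genotypes = {
    "snp1": ["AA", "GG", "GG", "AA"],  # polymorphic, complete
    "snp2": ["CC", "CC", "CC", "CC"],  # monomorphic
    "snp3": ["AG", "--", "TT", "CC"],  # half of the calls unusable
    "snp4": ["TT", "CC", "--", "TT"],  # one missing call
}
example_table = {"snp1": ("G", "HgX"), "snp4": ("C", "HgY")}

example_kept = filter_y_snps(example_calls, max_missing=0.25)
for i in range(4):
    print(i, annotate(example_kept, example_table, i))
```

Treating heterozygous calls as unusable is one simple way to flag unreliable probes on a haploid chromosome; a real pipeline would also need to map chip marker names onto an established Y-SNP nomenclature before building a haplogroup tree.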