The First Kazakh Whole Genomes: The First Report of NGS Data.

Authors

Akilzhanova Ainur, Kairov Ulykbek, Rakhimova Saule, Molkenov Askhat, Rhie Arang, Kim Jong-Il, Seo Jeong-Sun, Zhumadilov Zhaxybay

Affiliations

Center for Life Sciences, Nazarbayev University, Astana, Kazakhstan.

Department of Biochemistry and Molecular Biology, Genomic Medicine Institute, Seoul National University College of Medicine, South Korea.

Publication information

Cent Asian J Glob Health. 2014 Dec 12;3(Suppl):146. doi: 10.5195/cajgh.2014.146. eCollection 2014.

DOI:10.5195/cajgh.2014.146
PMID:29805883
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5960922/

Abstract

INTRODUCTION

The human genome sequence will underpin human biology and medicine in the next century, providing a single, essential reference to all genetic information. Extraordinary technological advances and decreases in the cost of DNA sequencing have made whole genome sequencing (WGS) feasible as a highly accessible test for numerous indications. The international project "Genetic architecture of Kazakh population" is well underway to determine the complete DNA. Next generation sequencing is a powerful tool for genetic analysis that will enable us to uncover associations between specific loci in the genome and disease. The aim of this study was to present the first WGS data from 6 Kazakh individuals.

METHODS

This pilot study is among the first WGS studies of healthy Kazakh individuals. Six individuals were sequenced on the Illumina HiSeq 2000 next generation sequencing platform according to the manufacturer's protocols. All generated *.bcl files were converted and demultiplexed in a single step using the bcl2fasta application. Sequence reads were aligned against the human hg19 reference genome using bwa-mem. Sorting, removal of intermediate files, assembly of *.bam files, and duplicate marking were performed using the PicardTools package. The GATK HaplotypeCaller tool was used for variant calling. The ClinVar, SNPedia, and COSMIC databases were processed to identify clinical genomic variants in the 6 Kazakh whole genomes. The Java Runtime Environment and R/Bioconductor packages were installed to perform raw data processing and run program scripts.
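
The alignment, sorting, duplicate-marking, and variant-calling steps described above can be sketched as a sequence of command lines. This is a minimal illustration, not the authors' actual scripts: the abstract gives no tool versions, options, or file names, so the file-naming scheme, the read-group string, the `picard.jar` path, and the GATK4-style `gatk HaplotypeCaller` invocation are all assumptions (a 2014 pipeline would likely have used an earlier GATK release with different syntax). The bcl2fasta conversion step is omitted. The function only builds the commands; it does not run them.

```python
from typing import List


def build_pipeline_commands(sample: str, ref: str, fq1: str, fq2: str) -> List[List[str]]:
    """Build argv lists for: align -> sort -> mark duplicates -> call variants.

    All file names below are hypothetical and derived from the sample name.
    """
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    metrics = f"{sample}.dup_metrics.txt"
    vcf = f"{sample}.vcf.gz"
    # bwa expects a literal backslash-t between read-group fields.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"

    return [
        # 1. Align paired-end reads to the reference with bwa mem.
        ["bwa", "mem", "-R", read_group, "-o", sam, ref, fq1, fq2],
        # 2. Coordinate-sort the alignments with Picard SortSam.
        ["java", "-jar", "picard.jar", "SortSam",
         f"I={sam}", f"O={sorted_bam}", "SORT_ORDER=coordinate"],
        # 3. Mark PCR/optical duplicates and index the resulting BAM.
        ["java", "-jar", "picard.jar", "MarkDuplicates",
         f"I={sorted_bam}", f"O={dedup_bam}", f"M={metrics}", "CREATE_INDEX=true"],
        # 4. Call variants with GATK HaplotypeCaller (GATK4-style syntax, assumed).
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", dedup_bam, "-O", vcf],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline_commands("S1", "hg19.fa", "S1_R1.fastq.gz", "S1_R2.fastq.gz"):
        print(" ".join(cmd))
```

Each inner list could be passed to `subprocess.run(cmd, check=True)` once the tools and an indexed reference are in place; keeping command construction separate from execution makes the step order easy to inspect.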

RESULTS

Sequence alignment and mapping against the hg19 reference genome were completed for each of the 6 healthy Kazakh individuals. Between 87,308,581,400 and 107,526,741,301 total base pairs were sequenced per genome, with an average coverage of 29.85x. Between 98.85% and 99.58% of base pairs were mapped, and on average 96.07% were properly paired. Het/Hom and Ti/Tv ratios for each whole genome ranged from 1.35 to 1.52 and from 2.07 to 2.08, respectively. We compared each genome against the existing clinical databases ClinVar, SNPedia, and COSMIC and found 20 to 25, 269 to 288, and 7 to 12 SNP records, respectively. The availability of reference Kazakh genome sequences provides a basis for studying the nature of sequence variation, particularly single nucleotide polymorphisms.
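
The quality metrics reported above (mean coverage, Ti/Tv, Het/Hom) follow from simple counts. The sketch below shows one common way to compute them; it is an illustration under stated assumptions, not the authors' code. The input representation (tuples of reference and alternate bases, and diploid genotypes as pairs of allele indices) is hypothetical, and the genome length passed to `mean_coverage` is a user-supplied assumption, since the abstract does not state which length was used.

```python
from typing import Iterable, Tuple

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snvs: Iterable[Tuple[str, str]]) -> float:
    """Transition/transversion ratio for (ref, alt) single-base substitutions.

    Assumes every pair is a valid SNV with ref != alt, so any pair that is
    not a transition is counted as a transversion.
    """
    snvs = list(snvs)
    ti = sum(1 for pair in snvs if pair in TRANSITIONS)
    tv = len(snvs) - ti
    return ti / tv  # raises ZeroDivisionError if there are no transversions


def het_hom_ratio(genotypes: Iterable[Tuple[int, int]]) -> float:
    """Ratio of heterozygous calls to homozygous non-reference calls.

    Genotypes are pairs of allele indices, where 0 is the reference allele.
    Homozygous-reference calls (0, 0) are ignored.
    """
    genotypes = list(genotypes)
    het = sum(1 for a, b in genotypes if a != b)
    hom_alt = sum(1 for a, b in genotypes if a == b and a != 0)
    return het / hom_alt  # raises ZeroDivisionError if there are no hom-alt calls


def mean_coverage(total_bases: int, genome_length: int) -> float:
    """Average depth: total sequenced bases divided by the genome length."""
    return total_bases / genome_length


if __name__ == "__main__":
    # Toy inputs, made up for illustration only (not data from the study).
    print(ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
    print(het_hom_ratio([(0, 1), (0, 1), (1, 1), (0, 0)]))
    # Lowest per-genome yield from the abstract; genome length is an assumption.
    print(mean_coverage(87_308_581_400, 3_100_000_000))
```

For whole-genome human data, a Ti/Tv ratio near 2.0 to 2.1 is the commonly cited expectation, which is consistent with the 2.07 to 2.08 range reported here.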

CONCLUSION

The first whole genome sequencing of Kazakh individuals was performed. In this pilot study, we identified SNPs associated with different conditions. Further WGS studies of the Kazakh population are needed to identify possible unique genetic variants in Kazakhs.
