ENGINES: exploring single nucleotide variation in entire human genomes.

Affiliation

Grupo de Medicina Xenómica, CIBERER, Universidade de Santiago de Compostela, Santiago de Compostela, Galicia, Spain.

Publication information

BMC Bioinformatics. 2011 Apr 19;12:105. doi: 10.1186/1471-2105-12-105.

DOI:10.1186/1471-2105-12-105
PMID:21504571
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3107182/
Abstract

BACKGROUND

Next generation ultra-sequencing technologies are starting to produce extensive quantities of data from entire human genome or exome sequences, and therefore new software is needed to present and analyse this vast amount of information. The 1000 Genomes project has recently released raw data for 629 complete genomes representing several human populations through their Phase I interim analysis and, although there are certain public tools available that allow exploration of these genomes, to date there is no tool that permits comprehensive population analysis of the variation catalogued by such data.

DESCRIPTION

We have developed a genetic variant site explorer that retrieves single nucleotide variant (SNV) data, population by population, from entire genomes without compromising future scalability and agility. ENGINES (ENtire Genome INterface for Exploring SNVs) uses data from the 1000 Genomes Phase I to demonstrate its capacity to handle large amounts of genetic variation (>7.3 billion genotypes and 28 million SNVs) and to derive summary statistics of interest for medical and population genetics applications. The whole dataset is pre-processed and summarized into a data mart accessible through a web interface. The query system allows each available population sample to be combined and compared, and supports searching by rs-number list, chromosome region, or genes of interest. Frequency and FST filters are available to further refine queries, and results can be visually compared with other large-scale Single Nucleotide Polymorphism (SNP) repositories such as HapMap or Perlegen.
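The abstract does not describe the data mart schema, so the following is only a minimal sketch of the general idea: per-population summaries are computed once, stored in an indexed table, and then queried by chromosome region with a frequency filter. The table name `snv_summary`, its columns, the helper names, and the demo rows are all hypothetical and are not taken from ENGINES.

```python
import sqlite3

def build_mart(rows):
    """Create an in-memory summary table (hypothetical schema) and index it by region."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snv_summary ("
        "rs_id TEXT, chrom TEXT, pos INTEGER, population TEXT, alt_freq REAL)"
    )
    conn.executemany("INSERT INTO snv_summary VALUES (?, ?, ?, ?, ?)", rows)
    # The index lets region queries avoid scanning the whole table.
    conn.execute("CREATE INDEX idx_region ON snv_summary (chrom, pos)")
    return conn

def query_region(conn, chrom, start, end, population, min_freq=0.0):
    """Return (rs_id, pos, alt_freq) for one population within a chromosome region."""
    cur = conn.execute(
        "SELECT rs_id, pos, alt_freq FROM snv_summary "
        "WHERE chrom = ? AND pos BETWEEN ? AND ? "
        "AND population = ? AND alt_freq >= ? "
        "ORDER BY pos",
        (chrom, start, end, population, min_freq),
    )
    return cur.fetchall()

# Made-up rows for illustration only; these are not real variants or frequencies.
rows = [
    ("rs_demo1", "1", 1000, "POP_A", 0.10),
    ("rs_demo1", "1", 1000, "POP_B", 0.40),
    ("rs_demo2", "1", 2000, "POP_A", 0.02),
    ("rs_demo3", "2", 1500, "POP_A", 0.30),
]
conn = build_mart(rows)
print(query_region(conn, "1", 500, 2500, "POP_A", min_freq=0.05))
```

Storing one pre-computed row per variant and population keeps each query a simple indexed lookup, which is the usual motivation for a data mart over querying raw genotypes.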

CONCLUSIONS

ENGINES is capable of accessing large-scale variation data repositories in a fast and comprehensive manner. It allows quick browsing of whole genome variation, while providing statistical information for each variant site such as allele frequency, heterozygosity or FST values for genetic differentiation. The data mart generating scripts and the web interface are available at http://spsmart.cesga.es/engines.php.
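The per-site statistics named above can be illustrated for a single biallelic site. This is a sketch under stated assumptions, not the ENGINES implementation: the abstract does not say which FST estimator is used, so the code below uses the simple heterozygosity-based form FST = (HT - HS) / HT with an unweighted mean across populations. The function names and genotype counts are hypothetical.

```python
from typing import Dict, Tuple

# Genotype counts per population: (hom_ref, het, hom_alt)
Counts = Tuple[int, int, int]

def allele_frequency(counts: Counts) -> float:
    """Alternate-allele frequency from diploid genotype counts."""
    hom_ref, het, hom_alt = counts
    n_alleles = 2 * (hom_ref + het + hom_alt)
    if n_alleles == 0:
        raise ValueError("no genotypes for this population")
    return (het + 2 * hom_alt) / n_alleles

def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) for a biallelic site."""
    return 2.0 * p * (1.0 - p)

def fst(populations: Dict[str, Counts]) -> float:
    """Heterozygosity-based FST = (HT - HS) / HT, unweighted across populations.

    Assumption: this simple estimator is chosen for illustration only.
    """
    freqs = [allele_frequency(c) for c in populations.values()]
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    h_t = expected_heterozygosity(sum(freqs) / len(freqs))
    if h_t == 0.0:
        return 0.0  # site is monomorphic across all populations
    return (h_t - h_s) / h_t

# Made-up counts for illustration only.
site = {"POP_A": (25, 50, 25), "POP_B": (100, 0, 0)}
print(allele_frequency(site["POP_A"]), fst(site))
```

With these counts, POP_A has an alternate-allele frequency of 0.5 and POP_B of 0, giving HS = 0.25, HT = 0.375 and FST = 1/3.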
