Should we have blind faith in bioinformatics software? Illustrations from the SNAP web-based tool.

Authors

Robiou-du-Pont Sébastien, Li Aihua, Christie Shanice, Sohani Zahra N, Meyre David

Affiliations

Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada.

Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada; Population Health Research Institute, McMaster University and Hamilton Health Sciences, Hamilton General Hospital, Hamilton, Ontario, Canada.

Publication information

PLoS One. 2015 Mar 5;10(3):e0118925. doi: 10.1371/journal.pone.0118925. eCollection 2015.

DOI: 10.1371/journal.pone.0118925
PMID: 25742008
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4351168/

Abstract

Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any 'false positive' SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen's Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation.
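
The HapMap 3 figures in the abstract can be checked with a few lines of arithmetic. The sketch below is an illustration only, not the authors' code. It assumes a 2x2 agreement table rebuilt from the abstract's counts: 415 SNPs queried, 201 reported missing by SNAP, 152 of those actually on the array, and no false positives. The function name `cohen_kappa` and the variable names are hypothetical. The abstract does not state the kappa value itself, so the result here is a reconstruction under those assumptions.

```python
def cohen_kappa(both_present, tool_only, file_only, both_absent):
    """Cohen's kappa for two binary raters, given the four cells of a 2x2 table."""
    n = both_present + tool_only + file_only + both_absent
    observed = (both_present + both_absent) / n           # raw agreement
    tool_present = (both_present + tool_only) / n         # marginal: tool says present
    file_present = (both_present + file_only) / n         # marginal: file says present
    expected = (tool_present * file_present
                + (1 - tool_present) * (1 - file_present))  # agreement expected by chance
    return (observed - expected) / (1 - expected)


# Counts taken from the abstract (HapMap 3 release 2 reference).
TOTAL_SNPS = 415
SNAP_MISSING = 201       # reported as missing by SNAP
FALSE_NEGATIVES = 152    # reported missing, but present in the Illumina file
FALSE_POSITIVES = 0      # the abstract reports none

# Assumption: every SNP that SNAP reported as present is confirmed by the file.
both_present = TOTAL_SNPS - SNAP_MISSING - FALSE_POSITIVES   # 214
both_absent = SNAP_MISSING - FALSE_NEGATIVES                 # 49

# The abstract's 36.6% corresponds to false negatives over all queried SNPs.
false_negative_rate = FALSE_NEGATIVES / TOTAL_SNPS
kappa = cohen_kappa(both_present, FALSE_POSITIVES, FALSE_NEGATIVES, both_absent)

print(f"False-negative rate: {false_negative_rate:.1%}")  # 36.6%
print(f"Cohen's kappa: {kappa:.2f}")                      # 0.25
```

Under these assumptions kappa comes out near 0.25, which falls in the range usually labelled "fair" and is consistent with the abstract's "fair to moderate" wording. Note that 36.6% is 152/415, a share of all queried SNPs; dividing by the 366 SNPs actually on the array instead would give a higher figure.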

