Automatic peak selection by a Benjamini-Hochberg-based algorithm.

Affiliation

Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.

Publication information

PLoS One. 2013;8(1):e53112. doi: 10.1371/journal.pone.0053112. Epub 2013 Jan 7.

DOI:10.1371/journal.pone.0053112

PMID:23308147

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3538655/

Abstract

A common issue in bioinformatics is that computational methods often generate a large number of predictions sorted by a confidence score. The key problem is then to determine how many predictions must be selected to include most of the true predictions while maintaining reasonably high precision. In nuclear magnetic resonance (NMR)-based protein structure determination, for instance, computational peak picking methods are becoming increasingly common, yet expert knowledge remains the method of choice for deciding how many of the thousands of candidate peaks should be retained to capture the true peaks. Here, we propose a Benjamini-Hochberg (B-H)-based approach that automatically selects the number of peaks. We formulate the peak selection problem as a multiple testing problem. Given a candidate peak list sorted by either volume or intensity, we first convert the peaks into p-values and then apply the B-H-based algorithm to automatically select the number of peaks. The proposed approach is tested on state-of-the-art peak picking methods, including WaVPeak [1] and PICKY [2]. Compared with the traditional fixed number-based approach, our approach returns significantly more true peaks. For instance, by combining WaVPeak or PICKY with the proposed method, the missing peak rates are on average reduced by 20% and 26%, respectively, in a benchmark set of 32 spectra extracted from eight proteins. The consensus of the B-H-selected peaks from both WaVPeak and PICKY achieves 88% recall and 83% precision, which significantly outperforms each individual method and the consensus method without the B-H algorithm. The proposed method can be used as a standard procedure for any peak picking method and can be applied straightforwardly to other prediction selection problems in bioinformatics.
The source code, documentation and example data of the proposed method are available at http://sfb.kaust.edu.sa/pages/software.aspx.
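
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' released code. The abstract does not say how peaks are converted into p-values, so the empirical conversion below (comparing each peak volume against a hypothetical sample of noise volumes, `null_volumes`) is purely an assumption for illustration. The function names `empirical_pvalues` and `bh_select` and the default level `alpha=0.05` are also hypothetical. Only the B-H step-up rule itself is standard: with m sorted p-values, keep the k smallest, where k is the largest rank i such that p_(i) <= alpha * i / m.

```python
import numpy as np


def empirical_pvalues(volumes, null_volumes):
    """Assumed conversion (not from the paper): the p-value of a peak is the
    fraction of null (noise) volumes that are at least as large as its volume."""
    volumes = np.asarray(volumes, dtype=float)
    null_sorted = np.sort(np.asarray(null_volumes, dtype=float))
    # Count null volumes >= v with a binary search on the sorted null sample.
    n_ge = null_sorted.size - np.searchsorted(null_sorted, volumes, side="left")
    # Add-one correction keeps every p-value strictly positive.
    return (n_ge + 1.0) / (null_sorted.size + 1.0)


def bh_select(pvalues, alpha=0.05):
    """Return the indices of the peaks kept by the B-H step-up procedure,
    ordered from the smallest p-value to the largest."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Rank-dependent thresholds alpha * i / m for i = 1..m.
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = np.nonzero(p[order] <= thresholds)[0]
    if passed.size == 0:
        return np.array([], dtype=int)
    # Step-up rule: keep everything up to the largest passing rank.
    k = passed[-1] + 1
    return order[:k]


# Toy candidate list: the number of peaks kept is chosen by the data,
# not fixed in advance.
toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
kept = bh_select(toy_pvalues, alpha=0.05)
print("peaks kept:", kept.size)
```

Because the rule is step-up, a peak whose p-value misses its own threshold is still kept when a later-ranked peak passes, which is why the cutoff is the largest passing rank rather than the first failure.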

References

Protein domain recurrence and order can enhance prediction of protein functions.

Bioinformatics. 2012 Sep 15;28(18):i444-i450. doi: 10.1093/bioinformatics/bts398.

A general Bayesian method for an automated signal class recognition in 2D NMR spectra combined with a multivariate discriminant analysis.

J Biomol NMR. 1995 Apr;5(3):287-96. doi: 10.1007/BF00211755.

NMR View: A computer program for the visualization and analysis of NMR data.

J Biomol NMR. 1994 Sep;4(5):603-14. doi: 10.1007/BF00404272.

Bayesian signal extraction from noisy FT NMR spectra.

J Biomol NMR. 1994 Jul;4(4):505-18. doi: 10.1007/BF00156617.

WaVPeak: picking NMR peaks through wavelet-based smoothing and volume-based filtering.

Bioinformatics. 2012 Apr 1;28(7):914-20. doi: 10.1093/bioinformatics/bts078. Epub 2012 Feb 10.

A common sense approach to peak picking in two-, three-, and four-dimensional spectra using automatic computer analysis of contour diagrams. 1991.

J Magn Reson. 2011 Dec;213(2):357-63. doi: 10.1016/j.jmr.2011.09.007.

Error tolerant NMR backbone resonance assignment and automated structure generation.

J Bioinform Comput Biol. 2011 Feb;9(1):15-41. doi: 10.1142/s0219720011005276.

PICKY: a novel SVD-based NMR spectra peak picking method.

Bioinformatics. 2009 Jun 15;25(12):i268-75. doi: 10.1093/bioinformatics/btp225.

Automated structure determination from NMR spectra.

Eur Biophys J. 2009 Feb;38(2):129-43. doi: 10.1007/s00249-008-0367-z. Epub 2008 Sep 20.

Predicting protein function from domain content.

Bioinformatics. 2008 Aug 1;24(15):1681-7. doi: 10.1093/bioinformatics/btn312. Epub 2008 Jun 30.
