Learning and Imputation for Mass-spec Bias Reduction (LIMBR).

Affiliations

Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.

Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Publication information

Bioinformatics. 2019 May 1;35(9):1518-1526. doi: 10.1093/bioinformatics/bty828.

DOI:10.1093/bioinformatics/bty828

PMID:30247517

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6499252/

Abstract

MOTIVATION

Decreasing costs are making it feasible to perform time series proteomics and genomics experiments with more replicates and higher resolution than ever before. With more replicates and time points, proteome-wide and genome-wide patterns of expression are more readily discernible. These larger experiments require more batches, which exacerbates batch effects and increases the number of bias trends. In proteomics, where methods frequently result in missing data, this increasing scale is also decreasing the number of peptides observed in all samples. The sources of batch effects and missing data are incompletely understood, necessitating novel techniques.

RESULTS

Here we show that by exploiting the structure of time series experiments, it is possible to accurately and reproducibly model and remove batch effects. We implement Learning and Imputation for Mass-spec Bias Reduction (LIMBR) software, which builds on previous block-based models of batch effects and includes features specific to time series and circadian studies. To aid in the analysis of time series proteomics experiments, which are often plagued with missing data points, we also integrate an imputation system. By combining imputation and bias modeling tailored to time series in one straightforward software package, we expect LIMBR to significantly improve the quality and ease of large-scale proteomics and genomics time series experiments.
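As a rough illustration of what imputing missing peptide values can look like, the sketch below fills each gap with the mean of the k most similar fully observed rows. It is a generic nearest-neighbour scheme chosen for illustration only, not LIMBR's actual implementation. The function name `knn_impute`, the list-of-lists layout (rows as peptides, columns as samples) and the toy values are all assumptions.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries using the k nearest fully observed rows.

    Hypothetical helper for illustration; not part of the LIMBR package.
    rows: list of equal-length lists of floats, with None marking a missing value.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("at least one fully observed row is required")

    imputed = []
    for row in rows:
        if None not in row:
            imputed.append(list(row))
            continue

        # Compare rows only on the columns observed in the incomplete row.
        observed = [j for j, v in enumerate(row) if v is not None]

        def distance(other, row=row, observed=observed):
            return math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=distance)[:k]

        # Replace each missing entry with the neighbours' mean in that column.
        filled = [
            v if v is not None else sum(n[j] for n in neighbours) / len(neighbours)
            for j, v in enumerate(row)
        ]
        imputed.append(filled)
    return imputed


# Toy example: the last peptide is missing its second sample.
toy = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.1],
    [5.0, 6.0, 7.0],
    [1.0, None, 3.0],
]
filled_toy = knn_impute(toy, k=2)
print(filled_toy[3])
```

Rows that are already complete pass through unchanged, so the function can be applied to a whole matrix. In practice the choice of k and of the distance measure would need to be validated against the data at hand.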

AVAILABILITY AND IMPLEMENTATION

Python code and documentation are available for download at https://github.com/aleccrowell/LIMBR, and LIMBR can be downloaded and installed with its dependencies using 'pip install limbr'.
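After running the pip command, one quick way to confirm that the package is visible to the current Python environment is to look it up with the standard library. This is a minimal sketch: the helper name `is_installed` is hypothetical, and it assumes that the importable module shares the name `limbr` with the pip package.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the import name matches the pip package name "limbr".
if is_installed("limbr"):
    print("limbr is available in this environment")
else:
    print("limbr was not found; try 'pip install limbr'")
```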

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
