Using individual barcodes to increase quantification power of massively parallel reporter assays.

Authors

Keukeleire Pia, Rosen Jonathan D, Göbel-Knapp Angelina, Salomon Kilian, Schubach Max, Kircher Martin

Affiliations

Institute of Human Genetics, University Hospital Schleswig-Holstein, University of Lübeck, Lübeck, Germany.

Department of Genetics & Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.

Publication

BMC Bioinformatics. 2025 Feb 13;26(1):52. doi: 10.1186/s12859-025-06065-9.

DOI: 10.1186/s12859-025-06065-9
PMID: 39948460
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11827149/
Abstract

BACKGROUND

Massively parallel reporter assays (MPRAs) are an experimental technology for measuring the activity of thousands of candidate regulatory sequences or their variants in parallel, where the activity of individual sequences is measured from pools of sequence-tagged reporter genes. Activity is derived from the ratio of transcribed RNA to input DNA counts of associated tag sequences in each reporter construct, so-called barcodes. Recently, tools specifically designed to analyze MPRA data were developed that attempt to model the count data, accounting for its inherent variation. Of these tools, MPRAnalyze and mpralm are most widely used. MPRAnalyze models barcode counts to estimate the transcription rate of each sequence. While it has increased statistical power and robustness against outliers compared to mpralm, it is slow and has a high false discovery rate. Mpralm, a tool built on the R package Limma, estimates log fold-changes between different sequences. As opposed to MPRAnalyze, it is fast and has a low false discovery rate but is susceptible to outliers and has less statistical power.
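The activity measure described above (RNA counts relative to DNA counts for the barcodes of each construct) can be sketched in a few lines. This is a minimal illustration and not code from MPRAnalyze, mpralm, or BCalm: the barcode table, the `log2_activity` helper, and the pseudocount of 1 are assumptions made for the example, and library-size normalization is left out.

```python
import math
from collections import defaultdict


def log2_activity(rna, dna, pseudocount=1.0):
    """Log2 ratio of RNA to DNA counts; the pseudocount (an assumption) avoids division by zero."""
    return math.log2((rna + pseudocount) / (dna + pseudocount))


# Hypothetical barcode-level counts: (sequence, barcode, dna_count, rna_count)
counts = [
    ("seqA", "ACGT", 100, 200),
    ("seqA", "TTGA", 50, 100),
    ("seqB", "GGCA", 80, 40),
]

# Activity per individual barcode
per_barcode = {(seq, bc): log2_activity(rna, dna) for seq, bc, dna, rna in counts}

# Activity per sequence after summing the counts of all its barcodes
totals = defaultdict(lambda: [0, 0])
for seq, _bc, dna, rna in counts:
    totals[seq][0] += dna
    totals[seq][1] += rna
per_sequence = {seq: log2_activity(rna, dna) for seq, (dna, rna) in totals.items()}

for seq, activity in per_sequence.items():
    print(f"{seq}: log2(RNA/DNA) = {activity:.3f}")
```

A positive value means more RNA than DNA counts were observed for the sequence, a negative value the opposite.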

RESULTS

We propose BCalm, an MPRA analysis framework aimed at addressing the limitations of the existing tools. BCalm is an adaptation of mpralm, but models individual barcode counts instead of aggregating counts per sequence. Leaving out the aggregation step increases statistical power and improves robustness to outliers, while being fast and precise. We show the improved performance over existing methods on both simulated MPRA data and a lentiviral MPRA library of 166,508 target sequences, including 82,258 allelic variants. Further, BCalm adds functionality beyond the existing mpralm package, such as preparing count input files from MPRAsnakeflow, as well as an option to test for sequences with enhancing or repressing activity. Its built-in plotting functionalities allow for easy interpretation of the results.
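One way to see why skipping the aggregation step can help with outliers is the toy comparison below. It is only a sketch under stated assumptions, not BCalm's model: the counts are invented, aggregation is assumed to mean summing counts across barcodes, and the median of per-barcode log ratios stands in for a barcode-level estimate.

```python
import math
import statistics


def log2_ratio(rna, dna, pseudocount=1.0):
    """Log2 RNA/DNA ratio with a pseudocount (an assumption for this example)."""
    return math.log2((rna + pseudocount) / (dna + pseudocount))


# Hypothetical (dna_count, rna_count) pairs for the barcodes of one sequence.
# The last barcode is a deliberate outlier with a strongly inflated RNA count.
barcodes = [(100, 100), (120, 118), (90, 93), (110, 108), (100, 3000)]

# Aggregated estimate: sum all counts first, then take one ratio
total_dna = sum(dna for dna, _ in barcodes)
total_rna = sum(rna for _, rna in barcodes)
aggregated = log2_ratio(total_rna, total_dna)

# Barcode-level estimate: one ratio per barcode, summarized by the median
per_barcode = [log2_ratio(rna, dna) for dna, rna in barcodes]
barcode_median = statistics.median(per_barcode)

print(f"aggregated log2 ratio:       {aggregated:.2f}")
print(f"median of per-barcode ratio: {barcode_median:.2f}")
```

In this toy data the single outlier pulls the summed estimate well above zero, while the barcode-level summary stays near zero. Keeping barcodes separate also leaves five observations per sequence instead of one, which is the intuition behind the gain in statistical power.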

CONCLUSIONS

With BCalm, we provide a new tool for analyzing MPRA data which is robust and accurate on real MPRA datasets. The package is available at https://github.com/kircherlab/BCalm .


