
Perplexity as a Metric for Isoform Diversity in the Human Transcriptome.

Authors

Schertzer Megan D, Park Stella H, Su Jiayu, Sheynkman Gloria M, Knowles David A

Affiliations

New York Genome Center, New York, NY.

Department of Computer Science, Columbia University, New York, NY.

Publication information

bioRxiv. 2025 Jul 2:2025.07.02.662769. doi: 10.1101/2025.07.02.662769.

DOI: 10.1101/2025.07.02.662769
PMID: 40631152
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12236620/
Abstract

Long-read sequencing (LRS) has revealed a far greater diversity of RNA isoforms than earlier technologies, increasing the critical need to determine which, and how many, isoforms per gene are biologically meaningful. To define the space of relevant isoforms from LRS, many existing analysis pipelines rely on arbitrary expression cutoffs, but a single threshold cannot accommodate the broad variability in isoform complexity across genes, cell types, and disease states captured by LRS. To address this, we propose using perplexity, an interpretable measure derived from entropy that quantifies the effective number of isoforms per gene based on the full, unfiltered isoform ratio distribution. Calculating perplexity for 124 ENCODE4 PacBio LRS datasets spanning 55 human cell types, we show that it provides intuitive assessments of isoform diversity and captures uncertainty across genes with varying complexity. Perplexity can be calculated at multiple gene regulatory levels, from transcript to protein, to compare how isoform diversity is reduced across stages of gene expression. On average, genes have an ORF-level perplexity of 2.1, indicating production of two distinct protein isoforms. We extended this analysis to evaluate expression variation across tissues and identified 4,593 ORFs across 3,102 genes with moderate to extreme tissue-specificity. We propose perplexity as a consistent, quantitative metric for interpreting isoform diversity across genes, cell types, and disease states. All results are compiled into a community resource to enable cross-study comparisons of novel isoforms.
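
To make the "effective number of isoforms" idea concrete, here is a minimal sketch using the standard definition of perplexity as the exponential of Shannon entropy. The abstract does not give the authors' exact formula or preprocessing, so treat this as an assumption; the function name `perplexity` and the input format (a list of per-isoform read counts for one gene) are hypothetical.

```python
import math


def perplexity(counts):
    """Effective number of isoforms for one gene.

    Assumption: perplexity = exp(Shannon entropy) of the isoform
    ratio distribution. `counts` is a hypothetical list of read
    counts, one entry per isoform of the gene.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("gene has no reads")
    # Isoform ratios: each isoform's share of the gene's reads.
    # Zero-count isoforms are skipped; they contribute nothing to entropy.
    ratios = [c / total for c in counts if c > 0]
    # Shannon entropy in nats.
    entropy = -sum(p * math.log(p) for p in ratios)
    # Exponentiating the entropy converts it back to a count-like scale.
    return math.exp(entropy)
```

Under this definition, a gene whose reads are split evenly across n isoforms has a perplexity of approximately n, while a gene dominated by one isoform has a perplexity close to 1 however many low-abundance isoforms are detected. No expression cutoff is needed, because rare isoforms add very little to the entropy.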


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7781/12236620/4e7b59b17667/nihpp-2025.07.02.662769v1-f0001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7781/12236620/27f8a5aef0bb/nihpp-2025.07.02.662769v1-f0002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7781/12236620/bb4d6b823e29/nihpp-2025.07.02.662769v1-f0003.jpg
