EBT: a statistic test identifying moderate size of significant features with balanced power and precision for genome-wide rate comparisons.

Affiliations

Department of Cell Biology and Genetics, School of Basic Medical Sciences, Shenzhen University Health Science Center, Shenzhen 518060, China.

Epigenomics and Computational Biology Lab, Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, VA 24060, USA.

Publication information

Bioinformatics. 2017 Sep 1;33(17):2631-2641. doi: 10.1093/bioinformatics/btx294.

DOI:10.1093/bioinformatics/btx294
PMID:28472273
Abstract

MOTIVATION

In genome-wide rate comparison studies, it is a major challenge to objectively identify an appropriate number of significant features: traditional statistical comparisons without multi-testing correction can generate a large number of false positives, while multi-testing correction greatly decreases the statistical power.
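
The trade-off described above can be illustrated with a small sketch. It is not part of the paper or of the EBT package; the p-values are hypothetical and the helper names `bonferroni` and `benjamini_hochberg` are introduced here only for illustration. It counts how many features remain significant with no correction, with Benjamini-Hochberg FDR control, and with Bonferroni correction.

```python
def bonferroni(pvalues, alpha=0.05):
    """Flag p-values that pass the Bonferroni threshold alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Flag discoveries under the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Every p-value ranked at or below the cutoff is a discovery.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical per-feature p-values (illustration only, not from the paper).
pvalues = [0.0001, 0.004, 0.012, 0.03, 0.04, 0.2, 0.5, 0.9]

n_uncorrected = sum(p <= 0.05 for p in pvalues)
n_bh = sum(benjamini_hochberg(pvalues))
n_bonferroni = sum(bonferroni(pvalues))
print(n_uncorrected, n_bh, n_bonferroni)
```

With these made-up values, the number of significant features shrinks as the correction becomes stricter, which is the loss of power the abstract refers to.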

RESULTS

In this study, we proposed a new exact test based on the translation of rate comparison to two binomial distributions. With modeling and real datasets, the exact binomial test (EBT) showed an advantage in balancing the statistical precision and power, by providing an appropriate size of significant features for further studies. Both correlation analysis and bootstrapping tests demonstrated that EBT is as robust as the typical rate-comparison methods, e.g. χ² test, Fisher's exact test and Binomial test. Performance comparison among machine learning models with features identified by different statistical tests further demonstrated the advantage of EBT. The new test was also applied to analyze the genome-wide somatic gene mutation rate difference between lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), two main lung cancer subtypes, and a list of new markers were identified that could be lineage-specifically associated with carcinogenesis of LUAD and LUSC, respectively. Interestingly, three cilia genes were found selectively with high mutation rates in LUSC, possibly implying the importance of cilia dysfunction in the carcinogenesis.
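
For readers unfamiliar with the baseline methods named above, the following is a minimal sketch of one of them, Fisher's exact test, applied to a single-gene rate comparison between two groups. It is not the EBT procedure, whose details are not given in this abstract. The function name `fisher_exact_two_sided` and the mutation counts are hypothetical and chosen only for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are mutated / not mutated samples.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x mutated samples in group 1,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed
    # one; the small tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical counts: a gene mutated in 30 of 200 samples in one subtype
# and in 10 of 200 samples in the other (illustration only).
p_value = fisher_exact_two_sided(30, 170, 10, 190)
print(p_value)
```

In a genome-wide setting a test like this would be run once per gene, which is where the multiple-testing problem described in the Motivation section arises.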

AVAILABILITY AND IMPLEMENTATION

An R package implementing EBT can be downloaded freely from http://www.szu-bioinf.org/EBT.

CONTACT

wangyj@szu.edu.cn.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1. EBT: a statistic test identifying moderate size of significant features with balanced power and precision for genome-wide rate comparisons.
   Bioinformatics. 2017 Sep 1;33(17):2631-2641. doi: 10.1093/bioinformatics/btx294.
2. Exploring and comparing of the gene expression and methylation differences between lung adenocarcinoma and squamous cell carcinoma.
   J Cell Physiol. 2019 Apr;234(4):4454-4459. doi: 10.1002/jcp.27240. Epub 2018 Oct 14.
3. The genomic alterations of lung adenocarcinoma and lung squamous cell carcinoma can explain the differences of their overall survival rates.
   J Cell Physiol. 2019 Jul;234(7):10918-10925. doi: 10.1002/jcp.27917. Epub 2018 Dec 13.
4. LncRNAs are altered in lung squamous cell carcinoma and lung adenocarcinoma.
   Oncotarget. 2017 Apr 11;8(15):24275-24291. doi: 10.18632/oncotarget.13651.
5. Prognostic alternative mRNA splicing signature in non-small cell lung cancer.
   Cancer Lett. 2017 May 1;393:40-51. doi: 10.1016/j.canlet.2017.02.016. Epub 2017 Feb 20.
6. Cancer Stemness-Based Prognostic Immune-Related Gene Signatures in Lung Adenocarcinoma and Lung Squamous Cell Carcinoma.
   Front Endocrinol (Lausanne). 2021 Oct 21;12:755805. doi: 10.3389/fendo.2021.755805. eCollection 2021.
7. Identification of Immune-Related Gene Signatures in Lung Adenocarcinoma and Lung Squamous Cell Carcinoma.
   Front Immunol. 2021 Nov 23;12:752643. doi: 10.3389/fimmu.2021.752643. eCollection 2021.
8. Bioinformatics analyses of the differences between lung adenocarcinoma and squamous cell carcinoma using The Cancer Genome Atlas expression data.
   Mol Med Rep. 2017 Jul;16(1):609-616. doi: 10.3892/mmr.2017.6629. Epub 2017 May 25.
9. A prognosis-related molecular subtype for early-stage non-small lung cell carcinoma by multi-omics integration analysis.
   BMC Cancer. 2021 Feb 6;21(1):128. doi: 10.1186/s12885-021-07846-0.
10. Bioinformatics analysis of differentially expressed miRNAs in non-small cell lung cancer.
    J Clin Lab Anal. 2021 Feb;35(2):e23588. doi: 10.1002/jcla.23588. Epub 2020 Sep 23.

Cited by

1. T4SEpp: A pipeline integrating protein language models to predict bacterial type IV secreted effectors.
   Comput Struct Biotechnol J. 2024 Jan 23;23:801-812. doi: 10.1016/j.csbj.2024.01.015. eCollection 2024 Dec.
2. Machine Learning for Lung Cancer Diagnosis, Treatment, and Prognosis.
   Genomics Proteomics Bioinformatics. 2022 Oct;20(5):850-866. doi: 10.1016/j.gpb.2022.11.003. Epub 2022 Dec 1.
3. T1SEstacker: A Tri-Layer Stacking Model Effectively Predicts Bacterial Type 1 Secreted Proteins Based on C-Terminal Non-repeats-in-Toxin-Motif Sequence Features.
   Front Microbiol. 2022 Feb 8;12:813094. doi: 10.3389/fmicb.2021.813094. eCollection 2021.
4. A Multi-Gene Model Effectively Predicts the Overall Prognosis of Stomach Adenocarcinomas With Large Genetic Heterogeneity Using Somatic Mutation Features.
   Front Genet. 2020 Aug 26;11:940. doi: 10.3389/fgene.2020.00940. eCollection 2020.
5. T3SEpp: an Integrated Prediction Pipeline for Bacterial Type III Secreted Effectors.
   mSystems. 2020 Aug 4;5(4):e00288-20. doi: 10.1128/mSystems.00288-20.
6. LUADpp: an effective prediction model on prognosis of lung adenocarcinomas based on somatic mutational features.
   BMC Cancer. 2019 Mar 22;19(1):263. doi: 10.1186/s12885-019-5433-7.
7. Association of specific gene mutations derived from machine learning with survival in lung adenocarcinoma.
   PLoS One. 2018 Nov 12;13(11):e0207204. doi: 10.1371/journal.pone.0207204. eCollection 2018.
8. Combination of Genetic Markers and Age Effectively Facilitates the Identification of People with High Risk of Preeclampsia in the Han Chinese Population.
   Biomed Res Int. 2018 Jul 19;2018:4808046. doi: 10.1155/2018/4808046. eCollection 2018.
9. Improvement in prediction of prostate cancer prognosis with somatic mutational signatures.
   J Cancer. 2017 Sep 15;8(16):3261-3267. doi: 10.7150/jca.21261. eCollection 2017.