RefBool: a reference-based algorithm for discretizing gene expression data.

Author information

Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Belvaux, Luxembourg.

Publication information

Bioinformatics. 2017 Jul 1;33(13):1953-1962. doi: 10.1093/bioinformatics/btx111.

Abstract

MOTIVATION

The identification of genes or molecular regulatory mechanisms implicated in biological processes often requires the discretization, and in particular booleanization, of gene expression measurements. However, currently used methods mostly classify each measurement as active or inactive regardless of its statistical support, which can lead downstream analyses to conclusions based on spurious booleanization results.

RESULTS

To overcome the lack of certainty inherent in current methodologies and to improve discretization, we introduce RefBool, a reference-based algorithm for discretizing gene expression data. Instead of requiring each measurement to be classified as active or inactive, RefBool allows a third state that can be interpreted as intermediate gene expression. Furthermore, each measurement is associated with a p-value and a q-value indicating the significance of its classification. Validation of RefBool on a neuroepithelial differentiation study, followed by qualitative and quantitative comparison against 10 currently used methods, supports its advantages and shows clear improvements in the resulting clusterings.
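The idea of a three-state, significance-aware discretization can be illustrated with a minimal Python sketch. This is not the RefBool implementation (which is distributed as MATLAB files) and it does not reproduce the paper's statistical procedure. It assumes a hypothetical per-gene reference matrix, one-sided empirical p-values computed against that reference, and a Benjamini-Hochberg correction to obtain q-values. The names `discretize`, `benjamini_hochberg` and `alpha` are illustrative only.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values)."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by n / i.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q


def discretize(measurements, reference, alpha=0.05):
    """Classify each gene as 'active', 'inactive' or 'intermediate'.

    measurements: shape (n_genes,), one query expression value per gene.
    reference:    shape (n_genes, n_samples), assumed reference expression
                  values for the same genes (hypothetical input format).
    """
    x = np.asarray(measurements, dtype=float)[:, None]
    ref = np.asarray(reference, dtype=float)
    n_ref = ref.shape[1]

    # One-sided empirical p-values with a +1 pseudo-count so that no
    # p-value is exactly zero.
    p_active = (np.sum(ref >= x, axis=1) + 1) / (n_ref + 1)
    p_inactive = (np.sum(ref <= x, axis=1) + 1) / (n_ref + 1)

    q_active = benjamini_hochberg(p_active)
    q_inactive = benjamini_hochberg(p_inactive)

    # Genes without significant support for either state stay intermediate.
    states = np.full(x.shape[0], "intermediate", dtype=object)
    states[q_active < alpha] = "active"
    states[q_inactive < alpha] = "inactive"
    return states, q_active, q_inactive
```

In this sketch a gene is called active or inactive only when the corresponding q-value falls below `alpha`; everything else is left in the third, intermediate state instead of being forced into a Boolean value.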

AVAILABILITY AND IMPLEMENTATION

The software is available as MATLAB files in the Supplementary Information and as an online repository (https://github.com/saschajung/RefBool).

CONTACT

antonio.delsol@uni.lu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

