Redefining the high variable genes by optimized LOESS regression with positive ratio.

Author information

Xie Yue, Jing Zehua, Pan Hailin, Xu Xun, Fang Qi

Affiliations

College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China.

BGI Research, Shenzhen, 518083, China.

Publication information

BMC Bioinformatics. 2025 Apr 15;26(1):104. doi: 10.1186/s12859-025-06112-5.

Abstract

BACKGROUND

Single-cell RNA sequencing allows for the exploration of transcriptomic features at the individual cell level, but the high dimensionality and sparsity of the data pose substantial challenges for downstream analysis. Feature selection, therefore, is a critical step to reduce dimensionality and enhance interpretability.

RESULTS

We developed GLP, a robust feature selection algorithm that uses optimized locally estimated scatterplot smoothing (LOESS) regression to capture the relationship between each gene's average expression level and its positive ratio while minimizing overfitting. In our evaluations, the algorithm consistently outperformed eight leading feature selection methods across three benchmark criteria and improved downstream analysis, indicating a clear gain in gene subset selection.
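To make the idea concrete, the following is a minimal sketch of LOESS-based gene selection, not the authors' implementation. It assumes that the positive ratio (the fraction of cells in which a gene is detected) serves as the predictor and the log-transformed mean expression as the response, and that genes are ranked by how far they sit above the fitted curve. The function names, the fixed `frac` bandwidth, and the top-n selection rule are all hypothetical; the bandwidth optimization mentioned in the abstract is not reproduced here.

```python
import math


def positive_ratio_and_mean(counts):
    """Per-gene (positive ratio, mean count); counts[g] lists one count per cell."""
    stats = []
    for gene in counts:
        n_cells = len(gene)
        ratio = sum(1 for c in gene if c > 0) / n_cells
        mean = sum(gene) / n_cells
        stats.append((ratio, mean))
    return stats


def loess_fit(x, y, frac=0.5):
    """Local linear regression with tricube weights, evaluated at every x."""
    n = len(x)
    k = max(2, math.ceil(frac * n))  # number of neighbours in each local window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def select_genes(counts, n_top, frac=0.5):
    """Indices of the n_top genes lying furthest above the fitted curve.

    Assumption: the regression direction (ratio -> log mean) and the
    residual-ranking rule are illustrative choices, not taken from the paper.
    """
    stats = positive_ratio_and_mean(counts)
    x = [ratio for ratio, _ in stats]
    y = [math.log1p(mean) for _, mean in stats]
    fitted = loess_fit(x, y, frac)
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    order = sorted(range(len(counts)), key=lambda g: residuals[g], reverse=True)
    return order[:n_top]
```

Calling `select_genes(counts, n_top)` on a genes-by-cells count matrix returns gene indices ordered by positive residual. A gene expressed strongly in only a few cells has a low positive ratio but a high mean, so it lands well above the curve and ranks near the top.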

CONCLUSIONS

By preserving key biological information through feature selection, GLP provides informative features to enhance the accuracy and effectiveness of downstream analyses.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f4b/12001687/a074c1f665d7/12859_2025_6112_Fig1_HTML.jpg
