Redefining the high variable genes by optimized LOESS regression with positive ratio.

Authors

Xie Yue, Jing Zehua, Pan Hailin, Xu Xun, Fang Qi

Affiliations

College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China.

BGI Research, Shenzhen, 518083, China.

Publication information

BMC Bioinformatics. 2025 Apr 15;26(1):104. doi: 10.1186/s12859-025-06112-5.

DOI: 10.1186/s12859-025-06112-5
PMID: 40234751
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12001687/
Abstract

BACKGROUND

Single-cell RNA sequencing allows for the exploration of transcriptomic features at the individual cell level, but the high dimensionality and sparsity of the data pose substantial challenges for downstream analysis. Feature selection, therefore, is a critical step to reduce dimensionality and enhance interpretability.

RESULTS

We developed a robust feature selection algorithm that uses optimized locally estimated scatterplot smoothing (LOESS) regression to capture the relationship between each gene's average expression level and its positive ratio while minimizing overfitting. In our evaluations, the algorithm consistently outperformed eight leading feature selection methods across three benchmark criteria and improved downstream analysis.

CONCLUSIONS

By preserving key biological information through feature selection, GLP, the algorithm introduced here, provides informative features that improve the accuracy and effectiveness of downstream analyses.
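
The core idea in the abstract, regressing each gene's average expression on its positive ratio and looking at how far a gene departs from the fitted trend, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the positive ratio is the fraction of cells in which a gene has a nonzero count, that genes whose mean expression lies furthest above the fitted curve are the ones selected, and that a plain fixed-span LOESS is an acceptable stand-in for the "optimized" fit, whose details the abstract does not give. All function names (`gene_stats`, `loess_fit`, `select_genes`, `toy_counts`) are hypothetical.

```python
import math


def gene_stats(counts):
    """counts: list of cells, each a list of per-gene counts.
    Returns (mean expression, positive ratio) for every gene.
    Assumption: positive ratio = fraction of cells with a nonzero count."""
    n_cells = len(counts)
    means, ratios = [], []
    for g in range(len(counts[0])):
        column = [cell[g] for cell in counts]
        means.append(sum(column) / n_cells)
        ratios.append(sum(1 for v in column if v > 0) / n_cells)
    return means, ratios


def loess_fit(x, y, span=0.5):
    """Plain LOESS: tricube-weighted local linear fit evaluated at each x[i].
    The span is fixed here; the paper's optimization step is not reproduced."""
    n = len(x)
    k = max(2, math.ceil(span * n))  # number of neighbours per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th neighbour
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 1e-12 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def select_genes(counts, n_top=1, span=0.5):
    """Rank genes by how far their mean expression sits above the LOESS curve
    fitted against positive ratio (assumed selection direction)."""
    means, ratios = gene_stats(counts)
    expected = loess_fit(ratios, means, span)
    residuals = [m - e for m, e in zip(means, expected)]
    ranked = sorted(range(len(means)), key=lambda g: residuals[g], reverse=True)
    return ranked[:n_top], residuals


def toy_counts():
    """Toy cells-by-genes matrix: eight background genes with count 1 in a
    varying number of cells, plus one gene (index 8) with count 10 in 3 cells."""
    n_cells = 10
    genes = [[1 if c < k else 0 for c in range(n_cells)]
             for k in (1, 2, 4, 5, 6, 7, 8, 9)]
    genes.append([10 if c < 3 else 0 for c in range(n_cells)])
    return [[gene[c] for gene in genes] for c in range(n_cells)]


if __name__ == "__main__":
    top, residuals = select_genes(toy_counts(), n_top=1)
    print("selected gene indices:", top)
```

In the toy matrix, the background genes have a mean expression equal to their positive ratio, while the last gene is expressed at a high level in only a few cells, so it sits well above the trend and is ranked first. A real analysis would work on a normalized count matrix with thousands of genes and would need a principled choice of span.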

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f4b/12001687/a074c1f665d7/12859_2025_6112_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f4b/12001687/3b6ab71a9651/12859_2025_6112_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f4b/12001687/decfd70fe05e/12859_2025_6112_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f4b/12001687/48b6116f64fc/12859_2025_6112_Fig4_HTML.jpg
