Identification of Type 2 Diabetes Biomarkers From Mixed Single-Cell Sequencing Data With Feature Selection Methods.

Authors

Li Zhandong, Pan Xiaoyong, Cai Yu-Dong

Affiliations

College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China.

Key Laboratory of System Control and Information Processing, Institute of Image Processing and Pattern Recognition, Ministry of Education of China, Shanghai Jiao Tong University, Shanghai, China.

Publication information

Front Bioeng Biotechnol. 2022 Jun 2;10:890901. doi: 10.3389/fbioe.2022.890901. eCollection 2022.

DOI:10.3389/fbioe.2022.890901
PMID:35721855
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9201257/
Abstract

Diabetes is one of the most common diseases and a major threat to human health. Type 2 diabetes (T2D) accounts for about 90% of all cases. With the development of high-throughput sequencing technologies, more and more of the fundamental pathogenesis of T2D at the genetic and transcriptomic levels has been revealed. Recent single-cell sequencing can further reveal the cellular heterogeneity of complex diseases in an unprecedented way. To explore the molecular basis of T2D across multiple cell types, we investigated the expression profiles of about 1,600 single cells (949 cells from T2D patients and 651 cells from normal controls) and identified differential expression profiles and transcriptomic characteristics that can distinguish these two groups of cells at the single-cell level. The expression profiles were analyzed with several machine learning algorithms, including Monte Carlo feature selection, support vector machine, and repeated incremental pruning to produce error reduction (RIPPER). On the one hand, some T2D-associated genes (MTND4P24, MTND2P28, and LOC100128906) were discovered. On the other hand, we revealed novel potential pathogenic mechanisms in the form of rules. These mechanisms are induced by newly recognized genes and have been overlooked by traditional bulk sequencing techniques. In particular, the newly identified T2D genes were shown to follow specific quantitative rules with potential for diabetes prediction, and these rules further indicated several potential functional crosstalks involved in T2D.
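
The feature-ranking idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes synthetic data (40 hypothetical "cells", 10 hypothetical "genes", only gene 0 informative) and swaps in simplifications: one-split decision stumps instead of full decision trees, and held-out accuracy as the credit given to the chosen feature instead of the relative-importance score of the published Monte Carlo feature selection method. The names `best_stump`, `predict`, and `mcfs_scores` are illustrative only.

```python
import random


def best_stump(rows, labels, features):
    """Find the one-split rule (feature, threshold, polarity) with the
    highest training accuracy among the given candidate features."""
    best = (-1.0, None, None, None)
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            # Accuracy of the rule "value > thr predicts class 1".
            hits = sum((r[f] > thr) == bool(y) for r, y in zip(rows, labels))
            acc = hits / len(rows)
            # polarity 0 is the inverted rule, whose accuracy is 1 - acc.
            for polarity, a in ((1, acc), (0, 1 - acc)):
                if a > best[0]:
                    best = (a, f, thr, polarity)
    return best[1:]


def predict(row, f, thr, polarity):
    above = row[f] > thr
    return int(above) if polarity == 1 else int(not above)


def mcfs_scores(rows, labels, n_subsets=100, subset_size=3,
                n_splits=3, train_frac=0.66, seed=0):
    """Simplified Monte Carlo-style ranking: draw random feature subsets,
    fit a stump on random train splits, and credit the feature the stump
    chose with its accuracy on the held-out samples."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    scores = [0.0] * n_features
    idx = list(range(len(rows)))
    for _ in range(n_subsets):
        feats = rng.sample(range(n_features), subset_size)
        for _ in range(n_splits):
            rng.shuffle(idx)
            cut = int(train_frac * len(idx))
            train, test = idx[:cut], idx[cut:]
            f, thr, pol = best_stump([rows[i] for i in train],
                                     [labels[i] for i in train], feats)
            if f is None:  # no usable split in this subset
                continue
            correct = sum(predict(rows[i], f, thr, pol) == labels[i]
                          for i in test)
            scores[f] += correct / len(test)
    return scores


# Synthetic data (assumption): label 1 stands in for "T2D cell",
# label 0 for "control cell"; only gene 0 differs between the classes.
data_rng = random.Random(42)
n_features = 10
rows, labels = [], []
for i in range(40):
    y = i % 2
    row = [data_rng.gauss(0.0, 1.0) for _ in range(n_features)]
    row[0] = data_rng.gauss(3.0 if y else 0.0, 0.5)
    rows.append(row)
    labels.append(y)

scores = mcfs_scores(rows, labels)
ranking = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
print("features ranked by score:", ranking)
```

In a pipeline like the one the abstract describes, a ranking of this kind would then feed a classifier such as a support vector machine, with rule learners such as RIPPER used to extract readable thresholds; those steps are omitted here.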

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f5a/9201257/4ec8f3fbe5f5/fbioe-10-890901-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f5a/9201257/5f6f2b980661/fbioe-10-890901-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f5a/9201257/10989e133ff6/fbioe-10-890901-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f5a/9201257/876d229789ad/fbioe-10-890901-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f5a/9201257/01088b52eec7/fbioe-10-890901-g005.jpg
