Joint Screening for Ultra-High Dimensional Multi-Omics Data

Authors

Kemmo Tsafack Ulrich, Lin Chien-Wei, Ahn Kwang Woo

Affiliation

Division of Biostatistics, Medical College of Wisconsin (MCW), Milwaukee, WI 53226, USA.

Publication

Bioengineering (Basel). 2024 Nov 25;11(12):1193. doi: 10.3390/bioengineering11121193.

DOI: 10.3390/bioengineering11121193
PMID: 39768011
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11727280/
Abstract

Investigators often face ultra-high dimensional multi-omics data, where identifying significant genes and omics within a gene is of interest. In such data, each gene forms a group consisting of its multiple omics. Moreover, some genes may also be highly correlated. This leads to tri-level hierarchically structured data: the cluster level, which is a group of correlated genes; the subgroup level, which is the group of omics of the same gene; and the individual level, which consists of the individual omics. Screening is widely used to remove unimportant variables so that the number of remaining variables becomes smaller than the sample size. Penalized regression on the variables that remain after screening is then used to identify important variables. To screen out unimportant genes, we propose to cluster genes and then conduct screening. We show that the proposed screening method possesses the sure screening property. Extensive simulations show that the proposed screening method outperforms competing methods. We apply the proposed variable selection method to the TCGA breast cancer dataset to identify genes and omics that are related to breast cancer.
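
The general idea of screening, reducing the number of variables below the sample size before a penalized regression is fitted, can be illustrated with a minimal sketch. The code below is not the cluster-based joint screening proposed in the paper; it is plain marginal-correlation screening on simulated data, under assumptions made only for illustration. The function name `marginal_screen`, the simulated dimensions, the choice of keeping n/2 variables, and the positions of the three true signals are all hypothetical.

```python
import numpy as np

def marginal_screen(X, y, keep):
    """Return the sorted column indices of the `keep` variables whose
    absolute marginal correlation with y is largest."""
    Xc = X - X.mean(axis=0)          # center each column
    yc = y - y.mean()                # center the response
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    # Constant columns have zero norm; give them a score of zero.
    denom = np.where(denom == 0, np.inf, denom)
    score = np.abs(Xc.T @ yc) / denom
    top = np.argsort(score)[::-1][:keep]
    return np.sort(top)

# Simulated ultra-high dimensional setting: far more variables than samples.
rng = np.random.default_rng(0)
n, p = 100, 2000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[[3, 50, 700]] = [3.0, -3.0, 3.0]   # hypothetical true signals
y = X @ beta + 0.5 * rng.standard_normal(n)

# Keep fewer variables than the sample size, as screening requires.
kept = marginal_screen(X, y, keep=n // 2)
print(len(kept), kept[:5])
```

After a step like this, a penalized regression would be fitted on `X[:, kept]` only. The method described in the abstract differs in that it first clusters correlated genes and screens with the tri-level structure in mind, which this sketch does not attempt.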

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c46/11727280/e1f75a89dcc3/bioengineering-11-01193-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c46/11727280/c4240ba62184/bioengineering-11-01193-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c46/11727280/cf2c7ffbda8c/bioengineering-11-01193-g003.jpg
