Data-driven analysis approach for biomarker discovery using molecular-profiling technologies.

Authors

Wei T, Liao B, Ackermann B L, Jolly R A, Eckstein J A, Kulkarni N H, Helvering L M, Goldstein K M, Shou J, Estrem S T, Ryan T P, Colet J-M, Thomas C E, Stevens J L, Onyia J E

Affiliation

Integrative Biology, Lilly Research Laboratories, Greenfield, IN 46140, USA.

Publication

Biomarkers. 2005 Mar-Jun;10(2-3):153-72. doi: 10.1080/13547500500107430.

DOI: 10.1080/13547500500107430
PMID: 16076730

Abstract

High-throughput molecular-profiling technologies provide rapid, efficient and systematic approaches to search for biomarkers. Supervised learning algorithms are naturally suited to analysing the large amounts of data these technologies generate in biomarker discovery efforts. Using two examples, the study demonstrates a data-driven approach to analysing large, complicated datasets collected with high-throughput technologies in the context of biomarker discovery. The approach consists of two analytic steps: an initial unsupervised analysis to obtain accurate knowledge about sample clustering, followed by a supervised analysis to identify a small set of putative biomarkers for further experimental characterization. A comparison of the most widely applied clustering algorithms on a leukaemia DNA microarray dataset established that principal component analysis (PCA)-assisted projection of samples from a high-dimensional molecular feature space into a few low-dimensional subspaces provides a more effective and accurate way to explore the data visually and to identify data structures that confirm intended experimental effects based on expected group membership. A supervised analysis method, the shrunken centroid algorithm, was chosen to use the knowledge of sample clustering gained or confirmed in the first step to identify a small set of molecules as candidate biomarkers for further experimentation. The approach was applied to two molecular-profiling studies. In the first study, PCA-assisted analysis of DNA microarray data revealed that discrete data structures exist in rat liver gene expression and correlate with blood clinical chemistry and liver pathological damage in response to the chemical toxicant diethylhexylphthalate, a peroxisome proliferator-activated receptor agonist. Sixteen genes were then identified by the shrunken centroid algorithm as the best candidate biomarkers for liver damage. Functional annotation of these genes revealed roles in acute phase response and in lipid and fatty acid metabolism, which are functionally relevant to the observed toxicities. In the second study, 26 urine ions identified from a GC/MS spectrum, two of which were glucose fragment ions included as positive controls, showed robust changes with the development of diabetes in Zucker diabetic fatty rats. Further experiments are needed to define their chemical identities and establish their functional relevance to disease development.
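
The two-step approach described in the abstract (unsupervised projection first, supervised feature selection second) can be illustrated with a minimal sketch. This is not the authors' code. The toy data, the function names, the `delta` threshold and the choice of the median as the offset `s0` are all assumptions made for illustration. Power iteration stands in for a full PCA, and the shrinkage step follows the commonly described nearest shrunken centroid formula with the `sqrt(1/n_k - 1/n)` scaling.

```python
import math


def make_toy_data():
    """Hypothetical data: 12 samples x 6 features; features 0 and 1 are raised in group B."""
    rows, labels = [], []
    for i in range(12):
        group = "A" if i < 6 else "B"
        row = []
        for j in range(6):
            noise = ((2 * i + 3 * j) % 5 - 2) * 0.2  # small deterministic jitter
            shift = 5.0 if (group == "B" and j < 2) else 0.0
            row.append(noise + shift)
        rows.append(row)
        labels.append(group)
    return rows, labels


def first_principal_component(rows, iterations=100):
    """Step 1 (unsupervised): first PC direction and sample scores via power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    v = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length starting vector
    for _ in range(iterations):
        scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]  # X v
        w = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(p)]  # X^T X v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


def shrunken_centroid_features(rows, labels, delta, s0=None):
    """Step 2 (supervised): keep features whose shrunken class contrast is non-zero."""
    n, p = len(rows), len(rows[0])
    classes = sorted(set(labels))
    members = {k: [r for r, y in zip(rows, labels) if y == k] for k in classes}
    overall = [sum(r[j] for r in rows) / n for j in range(p)]
    centroids = {k: [sum(r[j] for r in m) / len(m) for j in range(p)]
                 for k, m in members.items()}
    # Pooled within-class standard deviation of each feature.
    s = []
    for j in range(p):
        ss = sum((r[j] - centroids[y][j]) ** 2 for r, y in zip(rows, labels))
        s.append(math.sqrt(ss / (n - len(classes))))
    if s0 is None:
        s0 = sorted(s)[p // 2]  # assumption: (upper) median of s as the offset
    selected = {}
    for j in range(p):
        shrunk = []
        for k in classes:
            m_k = math.sqrt(1.0 / len(members[k]) - 1.0 / n)
            d = (centroids[k][j] - overall[j]) / (m_k * (s[j] + s0))
            # Soft-threshold: move d towards zero by delta, clipping at zero.
            shrunk.append(math.copysign(max(abs(d) - delta, 0.0), d))
        if any(x != 0.0 for x in shrunk):
            selected[j] = shrunk
    return selected


rows, labels = make_toy_data()
_, pc1_scores = first_principal_component(rows)
for label, score in zip(labels, pc1_scores):
    print(label, round(score, 2))  # group structure should be visible along PC1
print(sorted(shrunken_centroid_features(rows, labels, delta=3.0)))
```

In this sketch the projection is only used to check that samples separate by expected group membership before any labels are trusted. The thresholding then reduces the feature list to the few whose class centroids stay away from the overall centroid. In practice `delta` would be tuned, for example by cross-validation, rather than fixed by hand.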

Similar articles

1
Data-driven analysis approach for biomarker discovery using molecular-profiling technologies.
Biomarkers. 2005 Mar-Jun;10(2-3):153-72. doi: 10.1080/13547500500107430.
2
Translational Metabolomics of Head Injury: Exploring Dysfunctional Cerebral Metabolism with Ex Vivo NMR Spectroscopy-Based Metabolite Quantification.
3
Pooling samples within microarray studies: a comparative analysis of rat liver transcription response to prototypical toxicants.
Physiol Genomics. 2005 Aug 11;22(3):346-55. doi: 10.1152/physiolgenomics.00260.2004. Epub 2005 May 24.
4
Feature selection and nearest centroid classification for protein mass spectrometry.
BMC Bioinformatics. 2005 Mar 23;6:68. doi: 10.1186/1471-2105-6-68.
5
Prediction of compound signature using high density gene expression profiling.
Toxicol Sci. 2002 Jun;67(2):232-40. doi: 10.1093/toxsci/67.2.232.
6
Unsupervised learning from complex data: the matrix incision tree algorithm.
Pac Symp Biocomput. 2001:30-41. doi: 10.1142/9789814447362_0004.
7
Functional genomics and proteomics in the clinical neurosciences: data mining and bioinformatics.
Prog Brain Res. 2006;158:83-108. doi: 10.1016/S0079-6123(06)58004-5.
8
Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery.
BMC Bioinformatics. 2005 Apr 13;6:97. doi: 10.1186/1471-2105-6-97.
9
Gas chromatography-mass spectrometry-based profiling of serum fatty acids in acetaminophen-induced liver injured rats.
J Appl Toxicol. 2014 Feb;34(2):149-57. doi: 10.1002/jat.2844. Epub 2012 Dec 12.
10
A Multiplatform Approach for the Discovery of Novel Drug-Induced Kidney Injury Biomarkers.
Chem Res Toxicol. 2017 Oct 16;30(10):1823-1834. doi: 10.1021/acs.chemrestox.7b00159. Epub 2017 Sep 27.

Cited by

1
Highly Multiplexed Tissue Imaging in Precision Oncology and Translational Cancer Research.
Cancer Discov. 2024 Nov 1;14(11):2071-2088. doi: 10.1158/2159-8290.CD-23-1165.
2
A semiautomated framework for integrating expert knowledge into disease marker identification.
Dis Markers. 2013;35(5):513-23. doi: 10.1155/2013/613529. Epub 2013 Oct 10.
3
Data processing and classification analysis of proteomic changes: a case study of oil pollution in the mussel, Mytilus edulis.
Proteome Sci. 2006 Sep 13;4(1):17. doi: 10.1186/1477-5956-4-17.