Framework for kernel regularization with application to protein clustering.

Authors

Fan Lu, Sündüz Keles, Stephen J. Wright, Grace Wahba

Affiliation

Department of Statistics, University of Wisconsin, Madison, WI 53706, USA.

Publication

Proc Natl Acad Sci U S A. 2005 Aug 30;102(35):12332-7. doi: 10.1073/pnas.0505411102. Epub 2005 Aug 18.

DOI: 10.1073/pnas.0505411102
PMID: 16109767
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1187947/

Abstract

We develop and apply a previously undescribed framework that is designed to extract information in the form of a positive definite kernel matrix from possibly crude, noisy, incomplete, inconsistent dissimilarity information between pairs of objects, obtainable in a variety of contexts. Any positive definite kernel defines a consistent set of distances, and the fitted kernel provides a set of coordinates in Euclidean space that attempts to respect the information available while controlling for complexity of the kernel. The resulting set of coordinates is highly appropriate for visualization and as input to classification and clustering algorithms. The framework is formulated in terms of a class of optimization problems that can be solved efficiently by using modern convex cone programming software. The power of the method is illustrated in the context of protein clustering based on primary sequence data. An application to the globin family of proteins resulted in a readily visualizable 3D sequence space of globins, where several subfamilies and subgroupings consistent with the literature were easily identifiable.
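
The abstract's statement that a positive definite kernel defines a consistent set of distances and a set of Euclidean coordinates can be illustrated with a minimal sketch. It uses the standard identity d²(i, j) = K[i][i] + K[j][j] − 2·K[i][j]. The toy matrix `K` and the helper names `kernel_distances` and `cholesky_coordinates` are assumptions made for illustration and do not come from the paper. A Cholesky factorization is used here only because it is short to write in plain Python; it is not claimed to be the authors' procedure for obtaining coordinates.

```python
import math


def kernel_distances(K):
    """Squared distances induced by a kernel matrix K (list of lists).

    Uses d2[i][j] = K[i][i] + K[j][j] - 2 * K[i][j].
    """
    n = len(K)
    return [[K[i][i] + K[j][j] - 2.0 * K[i][j] for j in range(n)]
            for i in range(n)]


def cholesky_coordinates(K):
    """Return lower-triangular L with L L^T = K.

    Row i of L is a coordinate vector for object i: inner products of
    the rows reproduce K, so Euclidean distances between rows reproduce
    the kernel-induced distances. Raises ValueError if K is not
    (numerically) positive definite.
    """
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = K[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("K is not positive definite")
                L[i][j] = math.sqrt(pivot)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    return L


# Hypothetical 3-object kernel matrix (symmetric, positive definite).
K = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]

d2 = kernel_distances(K)      # squared distances implied by K
X = cholesky_coordinates(K)   # one coordinate row per object
```

The squared Euclidean distance between rows `X[i]` and `X[j]` equals `d2[i][j]`, which is the sense in which the kernel yields a consistent set of distances: they are realizable by actual points in Euclidean space.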
