Quantitative assessment of relationship between sequence similarity and function similarity.

Authors

Joshi Trupti, Xu Dong

Affiliation

Digital Biology Laboratory, Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA.

Publication

BMC Genomics. 2007 Jul 9;8:222. doi: 10.1186/1471-2164-8-222.

DOI: 10.1186/1471-2164-8-222
PMID: 17620139
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1949826/
Abstract

BACKGROUND

Comparative sequence analysis is considered the first step towards annotating new proteins in genome annotation. However, sequence comparison may create and propagate function assignment errors. It is therefore important to analyze the quality of sequence-based function assignment thoroughly and systematically using large-scale data.

RESULTS

We present an analysis of the relationship between sequence similarity and function similarity for the proteins in four model organisms: Arabidopsis thaliana, Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster. Using a measure of functional similarity based on the three categories of Gene Ontology (GO) classifications (biological process, molecular function, and cellular component), we quantified the correlation between functional similarity and sequence similarity, with sequence similarity measured by sequence identity or by the statistical significance of the alignment. We then compared this correlation against that of randomly chosen protein pairs.
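The kind of measurement described above can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each protein carries hierarchical GO index strings such as "1-4-3-2" and that functional similarity is the number of leading index levels two proteins share; the abstract does not define the measure, so this scoring rule, the protein names, the index values, and the identity percentages are all illustrative. Because the toy set is tiny, the baseline is computed over every possible protein pair rather than over a random sample.

```python
from itertools import combinations
from statistics import mean

# Hypothetical GO index annotations (not taken from the paper).
ANNOTATIONS = {
    "P1": ["1-4-3-2"],
    "P2": ["1-4-3-5", "1-2-1"],
    "P3": ["1-4-7"],
    "P4": ["2-1"],
}

# Hypothetical aligned pairs: (protein, protein, percent sequence identity).
ALIGNED_PAIRS = [
    ("P1", "P2", 85.0),
    ("P1", "P3", 42.0),
    ("P2", "P3", 38.0),
    ("P1", "P4", 22.0),
]


def go_index_similarity(index_a, index_b):
    """Count the leading levels shared by two hierarchical GO indices."""
    shared = 0
    for level_a, level_b in zip(index_a.split("-"), index_b.split("-")):
        if level_a != level_b:
            break
        shared += 1
    return shared


def protein_similarity(name_a, name_b):
    """Best shared depth over all index pairs of the two proteins."""
    return max(
        go_index_similarity(a, b)
        for a in ANNOTATIONS[name_a]
        for b in ANNOTATIONS[name_b]
    )


def mean_similarity_by_identity_bin(pairs, bin_width=20):
    """Average functional similarity within each sequence-identity bin."""
    bins = {}
    for name_a, name_b, identity in pairs:
        # Clamp so that 100% identity falls into the top bin.
        lower = min(int(identity // bin_width) * bin_width, 100 - bin_width)
        bins.setdefault(lower, []).append(protein_similarity(name_a, name_b))
    return {lower: mean(sims) for lower, sims in sorted(bins.items())}


binned = mean_similarity_by_identity_bin(ALIGNED_PAIRS)

# Baseline: mean similarity over all protein pairs, standing in for
# the randomly chosen pairs mentioned in the text.
baseline = mean(
    protein_similarity(a, b) for a, b in combinations(ANNOTATIONS, 2)
)

for lower, value in binned.items():
    print(f"identity {lower}-{lower + 20}%: mean similarity {value}")
print(f"all-pairs baseline: {baseline:.3f}")
```

Bins whose mean sits well above the baseline are those where, in this toy setting, sequence identity carries information about shared function.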

CONCLUSION

Various sequence-function relationships were identified from BLAST versus PSI-BLAST, sequence identity versus Expectation Value, GO indices versus semantic similarity approaches, and within genome versus between genome comparisons, for the three GO categories. Our study provides a benchmark to estimate the confidence in assignment of functions purely based on sequence similarity.
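For readers unfamiliar with semantic similarity over an ontology, the following Python sketch shows one widely used measure, Resnik similarity, in which two terms are scored by the information content of their most informative common ancestor. The abstract does not say which semantic measure the study used, so this is only an example of the general approach; the toy ontology, term names, and annotation counts are hypothetical.

```python
import math

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
    "kinase": ["catalysis"],
}

# Hypothetical number of proteins annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "binding": 2,
    "catalysis": 1,
    "dna_binding": 4,
    "rna_binding": 3,
    "kinase": 6,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log p(t), where p(t) counts a term and its descendants."""
    cumulative = {term: 0 for term in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        for ancestor in ancestors(term):
            cumulative[ancestor] += count
    total = cumulative["root"]
    return {t: -math.log(c / total) for t, c in cumulative.items() if c > 0}


def resnik(term_a, term_b, ic):
    """Information content of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(ic[t] for t in common)


IC = information_content()
print(resnik("dna_binding", "rna_binding", IC))  # share the 'binding' ancestor
print(resnik("dna_binding", "kinase", IC))       # share only the root
```

Terms that meet only at the root score zero, while terms sharing a specific, rarely used ancestor score higher. A prefix-matching GO index measure, by contrast, depends only on the depth of the shared path.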

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/56a13b4d95c9/1471-2164-8-222-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/91f8c7378ae0/1471-2164-8-222-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/a56761dfeece/1471-2164-8-222-9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/f5bf0b9b8651/1471-2164-8-222-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/a0491d52b8af/1471-2164-8-222-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/3f108a73fa4e/1471-2164-8-222-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/5a49b0cbb339/1471-2164-8-222-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/47bc82fa52ea/1471-2164-8-222-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/2fee73bf278c/1471-2164-8-222-8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/02d0381ccdbe/1471-2164-8-222-10.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/d82283387c7e/1471-2164-8-222-11.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/7a042a724102/1471-2164-8-222-12.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfc0/1949826/d1d581a274a4/1471-2164-8-222-13.jpg
