HYPO: A Database of Human Hypothetical Proteins.

Authors

Sundararajan Vijayaraghava S, Malik Girik, Ijaq Johny, Kumar Anuj, Das Partha Sarathi, P R Shidhi, Nair Achuthsankar S, Dhar Pawan K, Suravajhala Prashanth

Affiliations

Bioclues.org, Kukatpally, Hyderabad 500072, India.

Environmental Health Institute, National Environment Agency, 11 Biopolis Way, #04-03/04, Helios Block, Singapore.

Publication information

Protein Pept Lett. 2018;25(8):799-803. doi: 10.2174/0929866525666180828110444.

Abstract

BACKGROUND

The function of some genes remains obscure because they may show no similarity to known regions of the genome. Such known 'unknown' genes, which constitute Open Reading Frames (ORFs) that remain in the epigenome, are termed orphan genes, and the proteins they encode, for which there is no experimental evidence of translation, are termed 'Hypothetical Proteins' (HPs).

OBJECTIVES

We have enhanced our earlier database of human Hypothetical Proteins (HypoDB) with additional annotation, application programming interfaces and descriptive features. The database hosts more than 1,000 manually curated records of the known 'unknown' regions in the human genome. The updated version of HypoDB, with BLAST and Match functionality, is freely accessible at http://www.bioclues.org/hypo2.

METHODS

The full collection of HPs was checked against the experimentally validated set (Swiss-Prot), the non-experimentally validated set (TrEMBL) or the complete set (UniProtKB). The database was designed with Java as the core backend and integrated with databases such as EMBL, PIR and HPRD, as well as with databases that include descriptors for structural, interaction and association databases.
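As an illustration only, the three-way check described above could be sketched as follows. This is not the authors' implementation (their backend is Java, and the abstract does not give the curation criteria). The `ProteinRecord` type, the `HP_KEYWORDS` list, the function names and the accessions are all hypothetical. The one fact relied on is that UniProtKB is the union of the reviewed Swiss-Prot section and the unreviewed TrEMBL section.

```python
from dataclasses import dataclass

# Hypothetical name patterns; the abstract does not state how HPs were selected.
HP_KEYWORDS = ("hypothetical protein", "uncharacterized protein")


@dataclass(frozen=True)
class ProteinRecord:
    """Minimal stand-in for a UniProtKB entry (illustrative fields only)."""
    accession: str
    name: str
    reviewed: bool  # True = Swiss-Prot section, False = TrEMBL section


def is_hypothetical(record: ProteinRecord) -> bool:
    """Return True if the protein name matches one of the assumed HP keywords."""
    name = record.name.lower()
    return any(keyword in name for keyword in HP_KEYWORDS)


def partition_by_source(records):
    """Group candidate HP accessions into Swiss-Prot, TrEMBL and UniProtKB sets."""
    sets = {"Swiss-Prot": [], "TrEMBL": [], "UniProtKB": []}
    for record in records:
        if not is_hypothetical(record):
            continue
        section = "Swiss-Prot" if record.reviewed else "TrEMBL"
        sets[section].append(record.accession)
        # UniProtKB is the complete set, so every candidate is listed here too.
        sets["UniProtKB"].append(record.accession)
    return sets


# Made-up records for demonstration; these are not real UniProtKB accessions.
EXAMPLE_RECORDS = [
    ProteinRecord("X00001", "Hypothetical protein FLJ0001", reviewed=True),
    ProteinRecord("X00002", "Uncharacterized protein C1orf000", reviewed=False),
    ProteinRecord("X00003", "Serum albumin", reviewed=True),
]

if __name__ == "__main__":
    for source, accessions in partition_by_source(EXAMPLE_RECORDS).items():
        print(source, accessions)
```

In this sketch a record with a well-characterized name, such as the third example, is excluded from all three sets, while each remaining candidate appears once in its own section and once in the complete set.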

RESULTS

HypoDB provides Application Programming Interfaces (APIs) for implicitly searching resources and linking them to other databases, such as NCBI LinkOut. It also offers multiple search capabilities, including advanced searches using the integrated bio-tools Match and BLAST.

CONCLUSION

HypoDB is perhaps the only open-source HP database with a range of tools for common bioinformatics retrievals, and it serves as a ready reference for researchers looking for candidate sequences for potential experimental work.

Cited by

1
Hypothetical Proteins as Predecessors of Long Non-coding RNAs.
Curr Genomics. 2020 Nov;21(7):531-535. doi: 10.2174/1389202921999200611155418.
