Matsuya Akihiro, Sakate Ryuichi, Kawahara Yoshihiro, Koyanagi Kanako O, Sato Yoshiharu, Fujii Yasuyuki, Yamasaki Chisato, Habara Takuya, Nakaoka Hajime, Todokoro Fusano, Yamaguchi Kaori, Endo Toshinori, Oota Satoshi, Makalowski Wojciech, Ikeo Kazuho, Suzuki Yoshiyuki, Hanada Kousuke, Hashimoto Katsuyuki, Hirai Momoki, Iwama Hisakazu, Saitou Naruya, Hiraki Aiko T, Jin Lihua, Kaneko Yayoi, Kanno Masako, Murakami Katsuhiko, Noda Akiko Ogura, Saichi Naomi, Sanbonmatsu Ryoko, Suzuki Mami, Takeda Jun-ichi, Tanaka Masayuki, Gojobori Takashi, Imanishi Tadashi, Itoh Takeshi
Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan.
Nucleic Acids Res. 2008 Jan;36(Database issue):D787-92. doi: 10.1093/nar/gkm878. Epub 2007 Nov 3.
Orthologs are genes in different species that evolved from a common ancestral gene by speciation. With the rapid growth of transcriptome data from various species, more reliable orthology information is a prerequisite for further studies. However, ortholog detection can be erroneous when it relies on pairwise distance-based methods such as reciprocal BLAST searches. We therefore constructed 'Evola', a fully curated database of evolutionary features of human genes, as a sub-database of H-InvDB, an integrated database of annotated human genes (http://h-invitational.jp/). In the ortholog detection process, computational analysis based on conserved genome synteny and transcript sequence similarity was followed by manual curation, in which researchers examined phylogenetic trees. In total, 18,968 human genes have computationally detected or manually curated orthologs among 11 vertebrates (chimpanzee, mouse, cow, chicken, zebrafish, etc.). Evola provides amino acid sequence alignments and phylogenetic trees of orthologs and homologs. In the 'd(N)/d(S) view', natural selection on genes can be analyzed between human and other species. In 'Locus maps', all transcript variants and their exon/intron structures can be compared among orthologous gene loci. We expect Evola to serve as a comprehensive and reliable database for comparative analyses that yield new knowledge about human genes. Evola is available at http://www.h-invitational.jp/evola/.
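The abstract's caution about reciprocal BLAST searches can be illustrated with a minimal sketch. It is not the Evola pipeline: the gene names (HsA, HsB, MmB), the scores, and the helper functions are all hypothetical, chosen to show how a reciprocal-best-hit rule can pair a human gene with a paralog when the true ortholog has been lost in the other species.

```python
from typing import Dict, List, Tuple

# Hypothetical similarity scores keyed by (human_gene, mouse_gene).
# Assumed scenario: HsA and HsB are ancient paralogs; the mouse ortholog of
# HsA was lost, so MmB (the true ortholog of HsB) is the only mouse gene
# left. HsB is assumed to have diverged faster, so it scores lower against
# MmB than HsA does.
Scores = Dict[Tuple[str, str], float]

toy_scores: Scores = {
    ("HsA", "MmB"): 300.0,
    ("HsB", "MmB"): 280.0,
}


def best_hits(scores: Scores, human_as_query: bool) -> Dict[str, str]:
    """Return the highest-scoring partner for each query gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (human, mouse), score in scores.items():
        query, subject = (human, mouse) if human_as_query else (mouse, human)
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(scores: Scores) -> List[Tuple[str, str]]:
    """Pair genes that are each other's best hit in both directions."""
    forward = best_hits(scores, human_as_query=True)
    backward = best_hits(scores, human_as_query=False)
    return sorted(
        (human, mouse)
        for human, mouse in forward.items()
        if backward.get(mouse) == human
    )


pairs = reciprocal_best_hits(toy_scores)
print(pairs)
```

Under these assumed scores the rule reports HsA and MmB as a pair, even though MmB is the ortholog of HsB in the scenario. Errors of this kind are what synteny evidence and inspection of phylogenetic trees are meant to catch.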