Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea.
BMC Genomics. 2011 Nov 30;12 Suppl 3(Suppl 3):S3. doi: 10.1186/1471-2164-12-S3-S3.
Hepatocellular carcinoma (HCC) is the fifth most common cancer worldwide. A number of molecular profiling studies have investigated the changes in gene and protein expression associated with various clinicopathological characteristics of HCC, generating a wealth of scattered information, usually in the form of gene signature tables. A database of published HCC gene signatures would help liver cancer researchers retrieve existing differential expression information on a candidate gene and compare signatures to prioritize common genes. A challenge in constructing such a database is that importing the signatures directly as they appear in articles would lose or obscure the context information that is essential for a correct biological interpretation of a gene's expression change. This challenge arises because the designation of the compared sample groups is most often abbreviated, ad hoc, or missing altogether from published signature tables. Without manual curation, the context information is lost and the database contents become uninformative. Although several databases of gene signatures are available, none of them presents signatures in an informative form or covers liver cancer comprehensively. We therefore constructed Liverome, a curated database of liver cancer-related gene signatures with self-contained context information.
Liverome's data coverage is more than three times that of any other signature database: it consists of 143 signatures taken from 98 HCC studies, mostly microarray and proteomic studies, and involves 6,927 genes. The signatures were post-processed into an informative and uniform representation and annotated with an itemized summary, so that all context information is unambiguously self-contained within the database. The signatures were also given informative names and organized into ten functional categories for guided browsing. The web interface enables straightforward retrieval of known differential expression information on a query gene and comparison of signatures to prioritize common genes. The utility of the data collected in Liverome is shown by case studies that yield useful biological insights into HCC.
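The two operations described above, looking up a query gene and comparing signatures to prioritize common genes, can be illustrated with a minimal sketch. This is not Liverome's implementation or data model: the signature names, the placeholder gene symbols (GENE1 to GENE4), the directions of change, and the functions `lookup_gene` and `rank_common_genes` are all hypothetical, chosen only to show the idea of a uniform, self-describing representation.

```python
from collections import Counter

# Hypothetical toy signatures: signature name -> {gene symbol: direction}.
# Names, genes and directions are illustrative only, not Liverome data.
SIGNATURES = {
    "HCC_vs_nontumor_liver_A": {"GENE1": "up", "GENE2": "down", "GENE3": "up"},
    "HCC_vs_nontumor_liver_B": {"GENE1": "up", "GENE3": "down", "GENE4": "up"},
    "metastatic_vs_nonmetastatic_HCC": {"GENE1": "up", "GENE4": "up"},
}


def lookup_gene(gene, signatures):
    """Return {signature name: direction} for every signature listing the gene."""
    return {
        name: genes[gene]
        for name, genes in signatures.items()
        if gene in genes
    }


def rank_common_genes(signatures, min_count=2):
    """Rank genes by how many signatures report them (ties broken by name)."""
    counts = Counter(gene for genes in signatures.values() for gene in genes)
    shared = [(gene, n) for gene, n in counts.items() if n >= min_count]
    return sorted(shared, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    print(lookup_gene("GENE3", SIGNATURES))
    print(rank_common_genes(SIGNATURES))
```

Because each signature name spells out the compared sample groups, a lookup result can be read without returning to the source article, which is the property the curation is meant to guarantee. Counting occurrences is only one possible prioritization rule; a real comparison might also require a consistent direction of change across signatures.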
The Liverome database provides a comprehensive collection of well-curated HCC gene signatures, together with straightforward interfaces for gene search and signature comparison. Liverome is available at http://liverome.kobic.re.kr.