Reyes Pía Francesca Loren, Michoel Tom, Joshi Anagha, Devailly Guillaume
The Roslin Institute, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, Scotland, UK.
Comput Struct Biotechnol J. 2017 Aug 26;15:425-432. doi: 10.1016/j.csbj.2017.08.002. eCollection 2017.
Functional annotation transfer across multi-gene family orthologs can lead to functional misannotations. We hypothesised that co-expression networks would help predict functional orthologs amongst complex homologous gene families. To explore whether transcriptomic data available in the public domain can identify the functionally equivalent genes among all predicted orthologs, we collected genome-wide expression data in mouse and rat liver from over 1500 experiments with varied treatments. We used a hyper-graph clustering method to identify clusters of orthologous genes co-expressed in both mouse and rat. We validated these clusters by analysing expression profiles in each species separately and demonstrating a high overlap. We then focused on genes in 18 homology groups with one-to-many or many-to-many relationships between the two species, to discriminate between functionally equivalent and non-equivalent orthologs. Finally, we applied the method to heart transcriptomic data (over 1400 experiments) from rat and mouse to validate it in an independent tissue.
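The abstract does not describe the hyper-graph clustering procedure in detail, so the sketch below is not the authors' method. It illustrates the underlying idea with a simpler, swapped-in technique: for a one-to-many homology group, each candidate ortholog is scored by how closely its co-expression profile against a set of one-to-one reference orthologs matches that of the gene in the other species. All function names, the simulated data, and the choice of Pearson correlation are assumptions made for illustration.

```python
import numpy as np


def coexpression_profile(expr, gene_idx, ref_idx):
    """Pearson correlation of one gene with each reference gene.

    expr is a (genes x samples) matrix for a single species.
    """
    rows = [gene_idx] + list(ref_idx)
    corr = np.corrcoef(expr[rows])
    return corr[0, 1:]


def rank_candidates(mouse_expr, rat_expr, mouse_gene, rat_candidates,
                    mouse_ref, rat_ref):
    """Score each rat candidate ortholog of one mouse gene.

    mouse_ref[i] and rat_ref[i] are assumed to be one-to-one orthologs.
    The score is the correlation between the two co-expression profiles,
    so samples never need to be matched across species.
    """
    mouse_profile = coexpression_profile(mouse_expr, mouse_gene, mouse_ref)
    scores = {}
    for cand in rat_candidates:
        rat_profile = coexpression_profile(rat_expr, cand, rat_ref)
        scores[cand] = float(np.corrcoef(mouse_profile, rat_profile)[0, 1])
    return scores


# Hypothetical simulated data: rows 0-5 are reference genes, row 6 follows
# the shared regulatory signal, row 7 responds in the opposite direction.
rng = np.random.default_rng(0)


def simulate(n_samples):
    signal = rng.normal(size=n_samples)

    def noise():
        return 0.1 * rng.normal(size=n_samples)

    refs = [signal + noise() for _ in range(3)]
    refs += [-signal + noise() for _ in range(3)]
    return np.vstack(refs + [signal + noise(), -signal + noise()])


mouse_expr = simulate(50)  # 50 mouse experiments
rat_expr = simulate(40)    # 40 unrelated rat experiments

scores = rank_candidates(mouse_expr, rat_expr, mouse_gene=6,
                         rat_candidates=[6, 7],
                         mouse_ref=range(6), rat_ref=range(6))
best = max(scores, key=scores.get)
print(scores, best)
```

Comparing co-expression profiles rather than raw expression values is what allows experiments from the two species to be used without pairing them, which matters when the data come from independent public studies.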