Villar Emilie, Zweig Nathanaël, Vincens Pierre, Cruz de Carvalho Helena, Duchene Carole, Liu Shun, Monteil Raphael, Dorrell Richard G, Fabris Michele, Vandepoele Klaas, Bowler Chris, Falciatore Angela
Institut de Biologie de l'École Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, Paris, 75005, France.
EV Consulting, Marseille, France.
Plant J. 2025 Mar;121(6):e70061. doi: 10.1111/tpj.70061.
Diatoms are prominent microalgae found in all aquatic environments. Over the last 20 years, thanks to the availability of genomic and genetic resources, diatom species such as Phaeodactylum tricornutum and Thalassiosira pseudonana have emerged as valuable experimental model systems for exploring topics ranging from evolution to cell biology, (eco)physiology, and biotechnology. Since the first genome sequencing projects were initiated more than 20 years ago, numerous genome-enabled datasets have been generated, based on RNA-Seq and proteomics experiments, epigenomes, and ecotype variant analysis. Unfortunately, these resources, generated by various laboratories, are often in disparate formats and challenging to access and analyze. Here we present DiatOmicBase, a genome portal gathering comprehensive omics resources from P. tricornutum and T. pseudonana to facilitate the exploration of dispersed public datasets and the design of new experiments based on the prior art. DiatOmicBase provides gene annotations, transcriptomic profiles, and a genome browser with ecotype variants, histone and methylation marks, transposable elements, non-coding RNAs, and read densities from RNA-Seq experiments. We developed a semi-automatically updated transcriptomic module to explore both publicly available RNA-Seq experiments and users' private datasets. Using gene-level expression data, users can perform exploratory data analysis, differential expression, pathway analysis, biclustering, and co-expression network analysis. Users can create heatmaps to visualize pre-computed comparisons for selected gene subsets. Automatic access to other bioinformatic resources and tools for diatom comparative and functional genomics is also provided. Focusing on the resources currently centralized for P. tricornutum, we showcase several examples of how DiatOmicBase strengthens molecular research on diatoms, making these organisms accessible to a broad research community.