1 Sciome, LLC, Research Triangle Park, North Carolina.
2 National Health and Environmental Effects Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina.
Zebrafish. 2019 Aug;16(4):331-347. doi: 10.1089/zeb.2018.1720. Epub 2019 Jun 12.
Sentinel gene sets have been developed with the purpose of maximizing the information from targeted transcriptomic platforms. We recently described the development of an S1500+ sentinel gene set, which was built for the human transcriptome, utilizing a data- and knowledge-driven hybrid approach to select a small subset of genes that optimally capture transcriptional diversity, correlation with other genes based on large-scale expression profiling, and known pathway annotation within the human genome. While this detailed bioinformatics approach for gene selection can in principle be applied to other species, the reliability of the resulting gene set depends on availability of a large body of transcriptomics data. For the model organism zebrafish, we aimed to create a similar sentinel gene set (Zf S1500+ gene set); however, there is insufficient standardized expression data in the public domain to train the gene correlation model. Therefore, our strategy was to use human-zebrafish ortholog mapping of the human S1500+ genes and nominations from experts in the zebrafish scientific community. In this study, we present the bioinformatics curation and refinement process to produce the final Zf S1500+ gene set, explore whole transcriptome extrapolation using this gene set, and assess pathway-level inference. This gene set will add value to targeted high-throughput transcriptomics in zebrafish for toxicogenomic screening and other research domains.
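The ortholog-mapping strategy described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `build_sentinel_set`, the toy ortholog table, and the nominated genes are assumptions chosen for illustration, and the real curation involved further refinement steps not shown here.

```python
# Minimal sketch (assumed, not the published pipeline): map human sentinel
# genes to zebrafish orthologs, then add expert-nominated zebrafish genes.

def build_sentinel_set(human_genes, ortholog_table, expert_nominations):
    """Return (sorted zebrafish gene list, human genes with no ortholog)."""
    mapped = set()
    unmapped = []
    for gene in human_genes:
        orthologs = ortholog_table.get(gene, [])
        if orthologs:
            # One human gene may map to several zebrafish genes.
            mapped.update(orthologs)
        else:
            unmapped.append(gene)
    # Union with expert nominations; duplicates collapse automatically.
    return sorted(mapped | set(expert_nominations)), unmapped


# Toy inputs for illustration only; not taken from the paper's gene lists.
human_sentinels = ["TP53", "SOX9", "CYP1A1", "HYPOTHETICAL_HUMAN_ONLY"]
ortholog_table = {
    "TP53": ["tp53"],
    "SOX9": ["sox9a", "sox9b"],  # one-to-many mapping
    "CYP1A1": ["cyp1a"],
}
expert_nominations = ["vtg1", "tp53"]

zf_set, unmapped = build_sentinel_set(
    human_sentinels, ortholog_table, expert_nominations
)
print(zf_set)    # zebrafish candidate gene set
print(unmapped)  # human genes that need manual review
```

Tracking the unmapped human genes separately makes it clear where the ortholog table leaves gaps, which is where expert nominations or manual curation would be most useful.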