Garda Samuele, Schwarz Jana Marie, Schuelke Markus, Leser Ulf, Seelow Dominik
Knowledge Management in Bioinformatics, Institute for Computer Science, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany.
Department of Neuropediatrics, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany.
Med Genet. 2021 Aug 14;33(2):167-177. doi: 10.1515/medgen-2021-2075. eCollection 2021 Jun.
High-throughput technologies have produced a continuously growing body of information about regulatory features in the genome. A wealth of data generated by large international research consortia is available from online databases. Disease-driven studies provide details on specific DNA elements or epigenetic modifications that regulate gene expression in specific cellular and developmental contexts, but these results are usually published only in scientific articles. All of this information can help in interpreting variants in the regulatory genome. This review describes a selection of high-profile data sources that provide information on the non-coding genome, as well as pitfalls in, and techniques for, searching and capturing information from the literature.