Department of Informatics, King's College London, London, UK.
Genomics England, Charterhouse Square, London, UK.
Bioinformatics. 2018 Sep 1;34(17):i743-i747. doi: 10.1093/bioinformatics/bty601.
Conserved non-coding elements (CNEs) are an enigmatic class of genomic elements which, despite being extremely conserved across evolution, do not encode proteins. Their functions are still largely unknown, so there is a need to investigate their roles in genomes systematically. To this end, identifying sets of CNEs in a wide range of organisms is an important first step. Currently, no tools for systematically identifying CNEs in genomes have been published in the literature.
We fill this gap by presenting CNEFinder, a tool for identifying CNEs between two given DNA sequences according to user-defined criteria. The results presented here show the tool's ability to identify CNEs accurately and efficiently. CNEFinder is based on a k-mer technique for computing maximal exact matches. The tool therefore neither requires nor computes whole-genome alignments or indexes, such as the suffix array or the Burrows-Wheeler transform (BWT), which makes it flexible to use on a wide scale.
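To illustrate the general idea of computing maximal exact matches from k-mers without a suffix array or BWT, the following is a minimal Python sketch. It is not CNEFinder's implementation: the function name `maximal_exact_matches` and the parameters `k` and `min_len` are assumptions made for illustration, and the sketch does not apply any of the tool's user-defined CNE criteria. It indexes the k-mers of one sequence in a hash table, uses shared k-mers as seeds, and extends each left-maximal seed to the right.

```python
from collections import defaultdict


def maximal_exact_matches(ref, query, k, min_len):
    """Return (ref_start, query_start, length) for every maximal exact
    match of length >= min_len between ref and query.

    Illustrative sketch only; names and parameters are hypothetical.
    """
    if k <= 0 or min_len < k:
        raise ValueError("require 0 < k <= min_len")

    # Hash-table index: k-mer -> list of start positions in ref.
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)

    matches = []
    for j in range(len(query) - k + 1):
        for i in index.get(query[j:j + k], ()):
            # Skip seeds that can be extended to the left: the match
            # they belong to is reported from its leftmost seed.
            if i > 0 and j > 0 and ref[i - 1] == query[j - 1]:
                continue
            # Extend the seed to the right while characters agree.
            length = k
            while (i + length < len(ref) and j + length < len(query)
                   and ref[i + length] == query[j + length]):
                length += 1
            if length >= min_len:
                matches.append((i, j, length))
    return matches


if __name__ == "__main__":
    # Shared substring "ACGTA" starts at position 2 in both sequences.
    print(maximal_exact_matches("GGACGTACC", "TTACGTATT", k=3, min_len=4))
    # → [(2, 2, 5)]
```

Because every match of length at least `k` contains a k-mer at its left end, choosing `k <= min_len` ensures no qualifying match is missed; smaller `k` yields more seeds to check, larger `k` fewer.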
CNEFinder is free software released under the terms of the GNU GPL (https://github.com/lorrainea/CNEFinder).