Center for Biotechnology, Bielefeld University, Universitaetsstrasse 25, D-33615 Bielefeld, Germany.
Nucleic Acids Res. 2012 Jan;40(Database issue):D1211-5. doi: 10.1093/nar/gkr1047. Epub 2011 Nov 12.
T-DNA insertion mutants are very valuable for reverse genetics in Arabidopsis thaliana. Several projects have generated large sequence-indexed collections of T-DNA insertion lines, of which GABI-Kat is the second largest resource worldwide. User access to the collection and its Flanking Sequence Tags (FSTs) is provided by the front end SimpleSearch (http://www.GABI-Kat.de). Several significant improvements have been implemented recently. The database now relies on the TAIRv10 genome sequence and annotation dataset. All FSTs have been newly mapped using an optimized procedure that leads to improved accuracy of insertion site predictions. A fraction of the collection with weak FST yield was re-analysed by generating new FSTs. Along with newly found predictions for older sequences, about 20,000 new FSTs were included in the database. Information is now included about groups of FSTs that point to the same insertion site in several lines although the insertion is real in only one of them, and many problematic FST-to-line links have been corrected using new wet-lab data. SimpleSearch currently contains data from ~71,000 lines with predicted insertions covering 62.5% of the 27,206 nuclear protein-coding genes, and offers insertion allele-specific data from 9,545 confirmed lines that are available from the Nottingham Arabidopsis Stock Centre.