Zych Konrad, Li Yang, van der Velde Joeri K, Joosen Ronny V L, Ligterink Wilco, Jansen Ritsert C, Arends Danny
University of Groningen, Groningen Bioinformatics Centre, Nijenborgh 7, 9747 AG Groningen, The Netherlands.
Jagiellonian University, Faculty of Biochemistry, Biophysics and Biotechnology, Gronostajowa Street 7, Krakow, 30-387, Poland.
BMC Bioinformatics. 2015 Feb 19;16(1):51. doi: 10.1186/s12859-015-0475-6.
Genetic markers and maps are instrumental in quantitative trait locus (QTL) mapping in segregating populations. The resolution of QTL localization depends on the number of informative recombinations in the population and how well they are tagged by markers. Larger populations and denser marker maps are better for detecting and locating QTLs. Marker maps that are initially too sparse can be saturated or derived de novo from high-throughput omics data (e.g., gene expression, protein or metabolite abundance). If these molecular phenotypes are affected by genetic variation due to a major QTL, they will show a clear multimodal distribution. Using this information, phenotypes can be converted into genetic markers.
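The idea of converting a multimodal phenotype into a marker can be illustrated with a minimal sketch. It assumes a simulated expression trait in 148 lines, a two-component Gaussian mixture fitted by expectation-maximization, and a posterior cutoff of 0.95 for calling a genotype. The function names, the "A"/"B" labels, the cutoff and the simulated values are all hypothetical choices made for illustration; this is not the Pheno2Geno API (which is an R package) and not necessarily the procedure it implements.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posteriors(values, weights, means, sds):
    """Posterior probability of each component for every value."""
    result = []
    for v in values:
        p = [weights[k] * normal_pdf(v, means[k], sds[k]) for k in (0, 1)]
        total = p[0] + p[1]
        if total == 0.0:  # numerical underflow: treat as uninformative
            result.append((0.5, 0.5))
        else:
            result.append((p[0] / total, p[1] / total))
    return result


def fit_two_component_mixture(values, n_iter=200):
    """EM for a 1-D mixture of two Gaussians (illustrative, not optimized)."""
    n = len(values)
    ordered = sorted(values)
    half = n // 2
    # Start the two means at the averages of the lower and upper halves.
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    grand_mean = sum(values) / n
    grand_sd = math.sqrt(sum((v - grand_mean) ** 2 for v in values) / n)
    sds = [grand_sd, grand_sd]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        resp = posteriors(values, weights, means, sds)  # E-step
        for k in (0, 1):  # M-step
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
    return weights, means, sds


def call_genotypes(values, weights, means, sds, cutoff=0.95):
    """Assign 'A' or 'B' when the posterior is decisive, else None (missing)."""
    calls = []
    for p_low, p_high in posteriors(values, weights, means, sds):
        if p_low >= cutoff:
            calls.append("A")
        elif p_high >= cutoff:
            calls.append("B")
        else:
            calls.append(None)
    return calls


# Simulated bimodal phenotype: 74 lines per parental allele (assumed values).
random.seed(1)
phenotype = [random.gauss(5.0, 0.4) for _ in range(74)]
phenotype += [random.gauss(9.0, 0.4) for _ in range(74)]

weights, means, sds = fit_two_component_mixture(phenotype)
calls = call_genotypes(phenotype, weights, means, sds)
print("component means:", [round(m, 2) for m in means])
print("A calls:", calls.count("A"), "B calls:", calls.count("B"),
      "missing:", calls.count(None))
```

Leaving ambiguous individuals as missing rather than forcing a call is one plausible design choice here: a marker with a few missing genotypes is usually less harmful to map construction than one with miscalled genotypes.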
The Pheno2Geno tool uses mixture modeling to select phenotypes and transform them into genetic markers suitable for construction and/or saturation of a genetic map. Pheno2Geno excludes candidate genetic markers that show evidence for multiple, possibly epistatically interacting, QTL and/or interaction with the environment, in order to provide a set of robust markers for follow-up QTL mapping. We demonstrate the use of Pheno2Geno on gene expression data of 370,000 probes in 148 A. thaliana recombinant inbred lines. Pheno2Geno is able to saturate the existing genetic map, decreasing the average distance between markers from 7.1 cM to 0.89 cM, close to the theoretical limit of 0.68 cM (with 148 individuals we expect a recombination every 100/148 = 0.68 cM); this pinpointed almost all of the informative recombinations in the population.
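The map-density figures quoted above can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text (148 individuals, 7.1 cM before, 0.89 cM after); the function and variable names are hypothetical.

```python
def expected_recombination_spacing_cm(n_individuals):
    """Spacing at which one recombination is expected, using the 100 / n rule quoted in the text."""
    return 100.0 / n_individuals


n_lines = 148
spacing_before_cm = 7.1   # average marker distance in the original map
spacing_after_cm = 0.89   # average marker distance after saturation

limit_cm = expected_recombination_spacing_cm(n_lines)
fold_denser = spacing_before_cm / spacing_after_cm
ratio_to_limit = spacing_after_cm / limit_cm

print(f"theoretical limit: {limit_cm:.2f} cM")
print(f"map density improved {fold_denser:.1f}-fold")
print(f"saturated map spacing is {ratio_to_limit:.2f}x the theoretical limit")
```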
The Pheno2Geno package makes use of genome-wide molecular profiling and provides a tool for high-throughput de novo map construction and saturation of existing genetic maps. Processing of the showcase dataset takes less than 30 minutes on an average desktop PC. Pheno2Geno improves QTL mapping results at no additional laboratory cost and with minimal computational effort. Its results are formatted for direct use in R/qtl, the leading R package for QTL studies. Pheno2Geno is freely available on CRAN under "GNU GPL v3". The Pheno2Geno package as well as the tutorial can also be found at: http://pheno2geno.nl.