Siniscalchi Carolina M, Hidalgo Oriane, Palazzesi Luis, Pellicer Jaume, Pokorny Lisa, Maurin Olivier, Leitch Ilia J, Forest Felix, Baker William J, Mandel Jennifer R
Department of Biological Sciences, Mississippi State University, Mississippi State, Mississippi 39762, USA.
Department of Biological Sciences, University of Memphis, Memphis, Tennessee 38152, USA.
Appl Plant Sci. 2021 Jun 23;9(7). doi: 10.1002/aps3.11422. eCollection 2021 Jul.
Phylogenetic studies in the Compositae are challenging because of the sheer size of the family and the difficulties its members pose for molecular tools, ranging from the genomic impact of polyploid events to their highly conserved plastid genomes. The search for better molecular tools for phylogenetic studies led to the development of the family-specific Compositae1061 probe set, as well as the universal Angiosperms353 probe set designed for all flowering plants. In this study, we evaluate the extent to which data generated using the family-specific kit and data obtained with the universal kit can be merged for downstream analyses.
We used comparative methods to verify the presence of shared loci between probe sets. Using two sets of eight samples sequenced with Compositae1061 and Angiosperms353, we ran phylogenetic analyses with and without loci flagged as paralogs, a gene tree discordance analysis, and a complementary phylogenetic analysis mixing samples from both sample sets.
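The abstract does not say how shared loci were identified, so the following is only a minimal sketch of one way such a comparison could be set up. It assumes each probe locus has already been annotated with a reference gene identifier (for example, from a similarity search against a common reference). The function name `shared_loci`, the locus names, and the gene labels are all hypothetical.

```python
def shared_loci(annot_a, annot_b):
    """Pair loci from two probe sets that map to the same reference gene.

    annot_a, annot_b: dicts mapping locus name -> reference gene ID
    (hypothetical annotations, not taken from the study).
    Returns a sorted list of (locus_a, locus_b, gene) tuples.
    """
    # Index the first probe set by reference gene.
    by_gene = {}
    for locus, gene in annot_a.items():
        by_gene.setdefault(gene, []).append(locus)

    # Look up each locus of the second probe set in that index.
    pairs = []
    for locus_b, gene in annot_b.items():
        for locus_a in by_gene.get(gene, []):
            pairs.append((locus_a, locus_b, gene))
    return sorted(pairs)


# Toy annotations for illustration only.
compositae = {"C1061_0001": "geneA", "C1061_0002": "geneB", "C1061_0003": "geneC"}
angiosperms = {"A353_4471": "geneB", "A353_5018": "geneD", "A353_6000": "geneC"}

for locus_a, locus_b, gene in shared_loci(compositae, angiosperms):
    print(locus_a, locus_b, gene)
```

In this toy input, two loci are shared between the sets. Only such shared loci could serve as common markers when samples sequenced with different kits are analyzed together.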
Our results show that the Compositae1061 kit recovers an average of 721 loci, with 9-46% of them presenting paralogs, whereas the Angiosperms353 kit yields an average of 287 loci, which are less affected by paralogy. Analyses mixing samples from both sets showed that the 30 loci shared between the probe sets allow data generated with either kit to be combined.
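As an illustration of how per-sample paralogy percentages of this kind can be tallied, here is a small sketch. It assumes each sample has a set of recovered loci and a set of loci carrying a paralog flag. How the flags are produced is not specified in the abstract, and the function names and toy data are hypothetical.

```python
def paralog_summary(recovered, flagged):
    """Per sample: (number of recovered loci, percent flagged as paralogous).

    recovered: dict sample -> set of recovered locus names
    flagged:   dict sample -> set of locus names with a paralog flag
    """
    summary = {}
    for sample, loci in recovered.items():
        n_loci = len(loci)
        n_flagged = len(loci & flagged.get(sample, set()))
        percent = 100.0 * n_flagged / n_loci if n_loci else 0.0
        summary[sample] = (n_loci, percent)
    return summary


def loci_without_paralogs(recovered, flagged):
    """Loci recovered in at least one sample and flagged in none."""
    all_loci = set().union(*recovered.values())
    all_flagged = set().union(*flagged.values())
    return all_loci - all_flagged


# Toy data for illustration only.
recovered = {"s1": {"L1", "L2", "L3", "L4"}, "s2": {"L1", "L2", "L3"}}
flagged = {"s1": {"L2"}, "s2": set()}

print(paralog_summary(recovered, flagged))
print(sorted(loci_without_paralogs(recovered, flagged)))
```

Dropping any locus flagged in at least one sample is a conservative choice that matches the idea of running analyses "without loci flagged as paralogs". Other filtering rules are possible.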
Combining data generated using different probe sets opens up the possibility of collaborative efforts and shared data within the synantherological community.