Wisconsin National Primate Research Center, University of Wisconsin-Madison, 1220 Capitol Court, Madison, WI, 53715, USA.
School of Veterinary Medicine, University of Wisconsin-Madison, 2015 Linden Drive, Madison, WI, 53706, USA.
BMC Genomics. 2020 Dec 7;21(1):873. doi: 10.1186/s12864-020-07278-3.
Orang-utans comprise three critically endangered species endemic to the islands of Borneo and Sumatra. Though whole-genome sequencing has recently accelerated our understanding of their evolutionary history, the costs of implementing routine genome screening and diagnostics remain prohibitive. Capitalizing on a tri-fold locus discovery approach, combining data from published whole-genome sequences, novel whole-exome sequencing, and microarray-derived genotype data, we aimed to develop a highly informative gene-focused panel of targets that can be used to address a broad range of research questions.
We identified and present genomic co-ordinates for 175,186 SNPs and 2,315 Y-chromosomal targets, plus 185 genes either known or presumed to be pathogenic in cardiovascular (N = 109) or respiratory (N = 43) diseases in humans (the primary and secondary causes of captive orang-utan mortality, respectively) or in a majority of other human diseases (N = 33). As proof of concept, we designed and synthesized 'SeqCap' hybrid capture probes for these targets, demonstrating cost-effective target enrichment and reduced-representation sequencing.
Our targets are of broad utility in studies of orang-utan ancestry, admixture and disease susceptibility and aetiology, and thus are of value in addressing questions key to the survival of these species. To facilitate comparative analyses, these targets could now be standardized for future orang-utan population genomic studies. The targets are broadly compatible with commercial target enrichment platforms and can be utilized as published here to synthesize applicable probes.
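As an illustration of how published target co-ordinates might be prepared for a commercial probe-design platform, the following minimal Python sketch pads single-base SNP positions into capture intervals, merges overlapping intervals, and prints them in BED format. It is not the authors' pipeline: the function names, the 60 bp flank, and the chromosome names and positions are hypothetical values chosen only for demonstration.

```python
from typing import Iterable, List, Tuple

# (chromosome, start, end) using 0-based, half-open coordinates as in BED
Interval = Tuple[str, int, int]


def snp_to_interval(chrom: str, pos_1based: int, flank: int = 60) -> Interval:
    """Pad a 1-based SNP position into a 0-based, half-open interval.

    The default flank of 60 bp is an assumption for illustration only.
    """
    start = max(0, pos_1based - 1 - flank)  # clamp at the chromosome start
    end = pos_1based + flank
    return (chrom, start, end)


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent intervals on each chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def to_bed(intervals: Iterable[Interval]) -> str:
    """Render intervals as tab-separated BED lines."""
    return "\n".join(f"{c}\t{s}\t{e}" for c, s, e in intervals)


if __name__ == "__main__":
    # Hypothetical SNP positions, not taken from the published target list.
    snps = [("chr1", 1000), ("chr1", 1050), ("chrY", 20)]
    targets = merge_intervals(snp_to_interval(c, p) for c, p in snps)
    print(to_bed(targets))
```

Merging before probe design avoids submitting redundant, overlapping regions when neighbouring SNPs fall within the same flanking window. The appropriate flank size depends on the probe length and the platform used.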