State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, 210096, Nanjing, China.
Nat Commun. 2021 Oct 14;12(1):6014. doi: 10.1038/s41467-021-26312-w.
We present a user-friendly, transferable method for genome-wide profiling of DNA G-quadruplexes (G4s) that identifies G4 structures from ordinary whole-genome resequencing data by capturing slight fluctuations in sequencing quality. In the human genome, 736,689 G4 structures were identified, including 45.9% of all predicted canonical G4-forming sequences. Over 89% of the detected canonical G4s were also identified by an approach that combines polymerase stop assays with next-generation sequencing. Tests on public datasets from 6 species demonstrated that the method is widely applicable, with detection rates of predicted canonical quadruplexes ranging from 32% to 58%. Because single nucleotide variations (SNVs) influence the formation of G4 structures and differ between individuals, the method can be used to identify and characterize G4s genome-wide in specific individuals.
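To make the "detection rate of predicted canonical quadruplexes" concrete, the following is a minimal sketch of how such a rate could be computed. It is not the authors' pipeline. The abstract does not define "canonical", so the sketch assumes the widely used motif of four runs of at least three guanines separated by loops of 1 to 7 nucleotides; the function names `predict_canonical_g4` and `detection_rate`, and the interval format, are hypothetical.

```python
import re

# Assumption: "canonical" means the commonly used G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ motif.
# The C-rich pattern finds the same motif on the reverse strand.
G4_PLUS = re.compile(r"G{3,}(?:[ACGTN]{1,7}G{3,}){3}")
G4_MINUS = re.compile(r"C{3,}(?:[ACGTN]{1,7}C{3,}){3}")


def predict_canonical_g4(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping motif matches."""
    seq = seq.upper()
    hits = [(m.start(), m.end(), "+") for m in G4_PLUS.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in G4_MINUS.finditer(seq)]
    return sorted(hits)


def detection_rate(predicted, detected):
    """Fraction of predicted intervals that overlap at least one detected interval.

    `detected` is a list of half-open (start, end) intervals, standing in for
    G4 calls produced by some experimental or computational method.
    """
    if not predicted:
        return 0.0
    found = sum(
        any(p_start < d_end and d_start < p_end for d_start, d_end in detected)
        for p_start, p_end, _ in predicted
    )
    return found / len(predicted)


# Toy sequence: one G-rich motif, a spacer, and one C-rich motif.
toy = "AAGGGTTAGGGTTAGGGTTAGGGAA" + "TTTT" + "CCCTAACCCTAACCCTAACCC"
predicted = predict_canonical_g4(toy)
rate = detection_rate(predicted, [(10, 15)])  # one hypothetical detected G4 call
```

On the toy sequence, `predicted` holds two motifs and the single hypothetical call overlaps only the first, so `rate` is 0.5. A genome-scale version would need chromosome-aware intervals and an indexed overlap search rather than the quadratic scan used here.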