Beyster Center for Molecular Genomics of Neuropsychiatric Diseases, University of California, San Diego, La Jolla, California, USA.
Nat Methods. 2012 Jul 1;9(8):819-21. doi: 10.1038/nmeth.2085.
Detecting genomic structural variants from high-throughput sequencing data is a complex and unresolved challenge. We have developed a statistical learning approach, based on Random Forests, that integrates prior knowledge about the characteristics of structural variants and leads to improved discovery in high-throughput sequencing data. The implementation of this technique, forestSV, offers high sensitivity and specificity coupled with the flexibility of a data-driven approach.