Benjamin Huremagic
Lirias (KU Leuven). 2026 May 4.
Neurodevelopmental disorders (NDDs) are a genetically and clinically heterogeneous group of conditions. Traditional short-read sequencing (SRS) methods leave a significant proportion of cases unresolved. This thesis explores how long-read sequencing (LRS), integrated methylation profiling, and federated data-sharing infrastructures can transform NDD discovery by bridging critical technological and systemic gaps in variant detection, interpretation, and cross-institutional collaboration. We demonstrate that long-read whole-genome sequencing (lrWGS), combined with optimized bioinformatic pipelines such as Sniffles2 and trio- and population-based filtering, improves the resolution and recall of structural variants (SVs). LRS analysis of 26 families enabled the discovery of novel SVs that went undetected by conventional short-read whole-genome sequencing (srWGS). By incorporating the telomere-to-telomere (T2T) reference genome, we uncovered variants missed with GRCh38, providing evidence that a transition toward more complete reference genomes will improve genetic diagnosis. LRS also enables the direct detection of native DNA methylation. We first demonstrate that methylome calling based on Oxford Nanopore Technologies (ONT) sequencing is concordant with other methylation mapping methods. We then explored haplotype-aware methylation mapping, which facilitates the study of X-chromosome inactivation and imprinting status. In female patients, we observed skewed X-inactivation patterns that had direct clinical implications. By combining variant and epigenetic analysis in a single assay, this approach consolidates what traditionally required multiple separate tests, thereby shortening the diagnostic timeline and increasing sensitivity for complex molecular diagnoses. Next, we asked whether chromatinopathy-specific episignatures could be mapped.
Using a support vector machine (SVM), we classified 17 of 19 patients with a known NDD based on their methylome profiles, confirming the presence of disease-specific episignatures. This approach not only enables molecular classification of syndromic NDDs but also supports the reclassification of variants of uncertain significance (VUS) and can offer insight into complex regulatory mechanisms. Our findings reinforce the diagnostic value of LRS not just as a research tool but as a viable first-line method for clinical use in NDDs. Combined genomic and epigenomic profiling enhances the analysis of individual genomes. However, uncovering novel causes of NDDs requires large datasets, which in turn requires collaboration across institutions. Thus far, most sites work in isolation. To enable large-scale collaboration without compromising patient privacy or violating data protection frameworks such as GDPR, we developed two federated platforms: MINDDS-Connect and WiNGS. MINDDS-Connect is a federated metadata-sharing platform designed for building virtual meta-cohorts. It enables researchers to query sample availability across multiple centers using standardized Human Phenotype Ontology (HPO) and Online Mendelian Inheritance in Man (OMIM) terms, without requiring physical data transfer. MINDDS-Connect allows institutions to retain local control of their data while sharing high-level, de-identified metadata, facilitating cohort matching for rare disorders. A pilot deployment across five centers provided access to 900 samples, validating the platform's utility for collaborative cohort construction and hypothesis generation. With 22q11.2 deletion syndrome (22q11.2DS) as a use case, we illustrate the potential of the system for condition-specific sample identification and deep phenotyping studies. Building on this model, WiNGS introduces federated variant-level data sharing and annotation across multiple institutions.
Unlike traditional centralized repositories, WiNGS keeps raw genomic and phenotypic data on premises. Analysis and filtering are executed locally through Docker-based deployments, and querying is enabled via a RESTful API. WiNGS currently manages over 6,400 single-nucleotide variant (SNV) samples, 1,500 copy-number variant (CNV) samples, and 500 SV samples. By supporting trio-based filtering and annotation, WiNGS can enhance diagnostic sensitivity and reduce false positives in real time. WiNGS serves as a scalable framework for federated genomic analysis and variant reclassification. Collectively, this thesis underscores the transformative potential of integrating LRS, methylation profiling, and federated informatics into a unified ecosystem. Our data support the use of T2T references and lrWGS-based methylation calling for routine diagnostics, and we present two deployable tools for cross-center genomic collaboration. By bridging diagnostic, epigenetic, and systemic challenges, the tools and methods developed in this thesis lay the groundwork for a more inclusive, precise, and efficient mapping of the genetic causes of rare diseases.
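The federated metadata query described above (sites answer HPO/OMIM queries locally and share only high-level results) can be illustrated with a minimal sketch. This is not the MINDDS-Connect implementation: the `SampleRecord` schema, the `Site` class, the `federated_count` function, and the toy records are all hypothetical, and the matching rule (all queried HPO terms present, at least one queried OMIM ID present) is an assumption made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class SampleRecord:
    """De-identified metadata for one sample (hypothetical schema)."""
    sample_id: str                      # stays inside the site
    hpo_terms: frozenset[str]           # e.g. "HP:0001263"
    omim_ids: frozenset[str] = frozenset()


@dataclass
class Site:
    """One participating center holding its own records locally."""
    name: str
    records: list[SampleRecord] = field(default_factory=list)

    def count_matching(self, hpo: set[str], omim: set[str] | None = None) -> int:
        """Evaluate the query locally and return only a count.

        Assumed rule: a sample matches if it carries every queried HPO
        term and, when OMIM IDs are given, at least one of them.
        """
        return sum(
            1
            for r in self.records
            if hpo <= r.hpo_terms and (not omim or bool(omim & r.omim_ids))
        )


def federated_count(
    sites: list[Site], hpo: set[str], omim: set[str] | None = None
) -> dict[str, int]:
    """Send the same query to each site and collect per-site counts.

    No sample-level record is moved; the caller sees only aggregates.
    """
    return {site.name: site.count_matching(hpo, omim) for site in sites}


if __name__ == "__main__":
    # Toy records for illustration only; not real cohort data.
    site_a = Site("center_a", [
        SampleRecord("a1", frozenset({"HP:0001263", "HP:0001629"}), frozenset({"188400"})),
        SampleRecord("a2", frozenset({"HP:0001263"})),
        SampleRecord("a3", frozenset({"HP:0001250"})),
    ])
    site_b = Site("center_b", [
        SampleRecord("b1", frozenset({"HP:0001263", "HP:0001250"}), frozenset({"188400"})),
    ])
    counts = federated_count([site_a, site_b], {"HP:0001263"}, {"188400"})
    print(counts, "total:", sum(counts.values()))
```

Returning counts rather than records is the design point of the sketch: a researcher can judge whether a virtual meta-cohort is large enough before any data-access request is made, while each center keeps control of its own data.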