Saak Samira, Oetting Dirk, Kollmeier Birger, Buhl Mareike
Medizinische Physik, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany.
Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany.
Trends Hear. 2025 Jan-Dec;29:23312165251349617. doi: 10.1177/23312165251349617. Epub 2025 Jun 30.
Audiological datasets contain valuable knowledge about hearing loss in patients, which can be uncovered using data-driven techniques. Our previous approach summarized patient information from one audiological dataset into distinct Auditory Profiles (APs). To obtain a better estimate of the audiological patient population, however, patient patterns must be analyzed across multiple, separated datasets, and finally, be integrated into a combined set of APs. This study aimed at extending the existing profile generation pipeline with an AP merging step, enabling the combination of APs from different datasets based on their similarity across audiological measures. The 13 previously generated APs (n = 595) were merged with 31 newly generated APs from a second dataset (n = 1,272) using a similarity score derived from the overlapping densities of common features across the two datasets. To ensure clinical applicability, random forest models were created for various scenarios, encompassing different combinations of audiological measures. A new set with 13 combined APs is proposed, providing separable profiles, which still capture detailed patient information from various test outcome combinations. The classification performance across these profiles is satisfactory. The best performance was achieved using a combination of loudness scaling, audiogram, and speech test information, while single measures performed worst. The enhanced profile generation pipeline demonstrates the feasibility of combining APs across datasets, which should generalize to all datasets and could lead to an interpretable global profile set in the future. The classification models maintain clinical applicability.
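The merging step relies on a similarity score derived from the overlapping densities of features common to both datasets. The abstract does not specify the density estimator or how per-feature overlaps are aggregated, so the following is only a minimal sketch under stated assumptions: densities are approximated by histograms on a shared grid, overlap is the summed bin-wise minimum, and the per-feature overlaps are averaged. The function names `density_overlap` and `profile_similarity`, the `bins` parameter, and the feature keys `"pta"` and `"srt"` are hypothetical and are not taken from the study.

```python
import numpy as np


def density_overlap(a, b, bins=20):
    """Overlap of two histogram density estimates on a shared grid, in [0, 1].

    Assumption: histograms stand in for whatever density estimate the
    study actually used.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    if lo == hi:
        # All values are identical, so the two distributions coincide.
        return 1.0
    edges = np.linspace(lo, hi, bins + 1)
    pa, _ = np.histogram(a, bins=edges)
    pb, _ = np.histogram(b, bins=edges)
    # Normalize counts to probability mass per bin.
    pa = pa / pa.sum()
    pb = pb / pb.sum()
    # Shared mass: bin-wise minimum of the two distributions.
    return float(np.minimum(pa, pb).sum())


def profile_similarity(profile_a, profile_b, bins=20):
    """Mean density overlap across the features present in both profiles.

    Assumption: an unweighted mean is used to aggregate features.
    """
    common = sorted(set(profile_a) & set(profile_b))
    if not common:
        raise ValueError("profiles share no features")
    overlaps = [density_overlap(profile_a[f], profile_b[f], bins) for f in common]
    return float(np.mean(overlaps))


# Synthetic illustration only; these are not study data.
rng = np.random.default_rng(0)
ap_first = {"pta": rng.normal(40, 8, 200), "srt": rng.normal(-2, 2, 200)}
ap_second = {"pta": rng.normal(45, 8, 300), "srt": rng.normal(-1, 2, 300)}
similarity = profile_similarity(ap_first, ap_second)
```

In a sketch like this, a score near 1 suggests that two profiles from different datasets cover nearly the same feature distributions and are candidates for merging, while a score near 0 suggests they are distinct.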