A Gene-Based Algorithm for Identifying Factors That May Affect a Speaker's Voice.
Author information
Singh Rita
Affiliation
Center for Voice Intelligence and Security, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Publication information
Entropy (Basel). 2023 Jun 2;25(6):897. doi: 10.3390/e25060897.
Over the past decades, many machine-learning- and artificial-intelligence-based technologies have been created to deduce biometric or bio-relevant parameters of speakers from their voice. These voice profiling technologies have targeted a wide range of parameters, from diseases to environmental factors, based largely on the fact that they are known to influence voice. Recently, some have also explored the prediction of parameters whose influence on voice is not easily observable through data-opportunistic biomarker discovery techniques. However, given the enormous range of factors that can possibly influence voice, more informed methods are needed for selecting those that may be deducible from voice. To this end, this paper proposes a simple path-finding algorithm that attempts to find links between vocal characteristics and perturbing factors using cytogenetic and genomic data. The links represent reasonable selection criteria for use by computational profiling technologies only, and are not intended to establish any unknown biological facts. The proposed algorithm is validated using a simple example from medical literature: the clinically observed effects of specific chromosomal microdeletion syndromes on the vocal characteristics of affected people. In this example, the algorithm attempts to link the genes involved in these syndromes to a single example gene (FOXP2) that is known to play a broad role in voice production. We show that in cases where strong links are exposed, vocal characteristics of the patients are indeed reported to be correspondingly affected. Validation experiments and subsequent analyses confirm that the methodology could be useful in predicting the existence of vocal signatures in naïve cases where their existence has not been otherwise observed.
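The abstract does not spell out the path-finding procedure, so the following is only a minimal sketch of the general idea, assuming the genomic data can be reduced to an undirected graph whose nodes are genes and whose edges are known associations. It uses a plain breadth-first search as a stand-in, which is not necessarily the paper's algorithm. The gene names `GENE_A` to `GENE_D`, the edges between them, and the helper names `build_graph` and `shortest_link` are hypothetical placeholders; only FOXP2 comes from the abstract.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (gene, gene) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_link(graph, source, target):
    """Return the shortest chain of genes from source to target, or None.

    Breadth-first search is an assumption made for illustration; the
    abstract only says that a simple path-finding algorithm is used.
    """
    if source == target:
        return [source]
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        gene = queue.popleft()
        for neighbor in graph.get(gene, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = gene
            if neighbor == target:
                # Walk back through the parents to recover the chain.
                path = [neighbor]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical toy associations, not real biological data.
TOY_EDGES = [
    ("GENE_A", "GENE_B"),
    ("GENE_B", "FOXP2"),
    ("GENE_C", "GENE_D"),
]
TOY_GRAPH = build_graph(TOY_EDGES)

print(shortest_link(TOY_GRAPH, "GENE_A", "FOXP2"))
print(shortest_link(TOY_GRAPH, "GENE_C", "FOXP2"))
```

In this toy graph, `GENE_A` reaches FOXP2 through `GENE_B`, while `GENE_C` has no route and the function returns `None`. Under this sketch, a shorter chain could be read as a stronger link, but how the paper actually scores link strength is not stated in the abstract.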