Suchting Robert, Gowin Joshua L, Green Charles E, Walss-Bass Consuelo, Lane Scott D
Department of Psychiatry and Behavioral Sciences, McGovern Medical School, University of Texas, Houston, TX, United States.
Section on Human Psychopharmacology, National Institute on Alcohol Abuse and Alcoholism, Rockville, MD, United States.
Front Behav Neurosci. 2018 May 7;12:89. doi: 10.3389/fnbeh.2018.00089. eCollection 2018.
Background: Given datasets with a large or diverse set of predictors of aggression, machine learning (ML) provides efficient tools for identifying the most salient variables and building a parsimonious statistical model. ML techniques permit efficient exploration of data but have not been widely used in aggression research, and they may be useful for those seeking to predict aggressive behavior.
Objective: The present study examined predictors of aggression and constructed an optimized model using ML techniques. Predictors were derived from a dataset that included demographic, psychometric and genetic predictors, specifically FK506 binding protein 5 (FKBP5) polymorphisms, which have been shown to alter response to threatening stimuli but have not been tested as predictors of aggressive behavior in adults.
Methods: The data analysis approach used component-wise gradient boosting and model reduction via backward elimination to: (a) select variables from an initial set of 20 to build a model of trait aggression; and then (b) reduce that model to maximize parsimony and generalizability.
Results: From a dataset of n = 47 participants, component-wise gradient boosting selected 8 of 20 possible predictors to model Buss-Perry Aggression Questionnaire (BPAQ) total score, with R² = 0.66. This model was simplified using backward elimination, retaining six predictors: smoking status, psychopathy (interpersonal manipulation and callous affect), childhood trauma (physical abuse and neglect), and the FKBP5_13 gene (rs1360780). The six-factor model approximated the initial eight-factor model at 99.4% of R².
Conclusions: Using an inductive data science approach, the gradient boosting model identified predictors consistent with previous experimental work in aggression, specifically psychopathy and trauma exposure. Additionally, allelic variants in FKBP5 were identified as predictors for the first time, but the relatively small sample size limits the generality of the results and calls for replication. This approach provides utility for the prediction of aggressive behavior, particularly in the context of large multivariate datasets.
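The two-step approach described in the abstract (component-wise gradient boosting for variable selection, then backward elimination) can be illustrated with a minimal Python sketch. This is not the authors' code and it does not use the study data: it runs on synthetic data with the same shape (47 rows, 20 predictors). The function names, the step size `nu`, the fixed number of boosting steps, and the 99% retention threshold are all assumptions made for illustration; the abstract does not state how these were chosen.

```python
import numpy as np

def r_squared(y, fitted):
    """Proportion of variance in y explained by the fitted values."""
    return 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)

def componentwise_l2_boost(X, y, n_steps=50, nu=0.1):
    """Component-wise L2 boosting with one linear base learner per column.

    n_steps and nu are illustrative defaults, not values from the study.
    """
    n, p = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardized predictors
    intercept = y.mean()
    coef = np.zeros(p)
    resid = y - intercept
    for _ in range(n_steps):
        # Least-squares slope of the residuals on each column alone.
        # Each column of Z has mean 0 and variance 1, so Z_j'Z_j = n.
        slopes = Z.T @ resid / n
        # The column with the largest |slope| gives the largest drop in RSS.
        j = int(np.argmax(np.abs(slopes)))
        coef[j] += nu * slopes[j]           # small step on that one coefficient
        resid -= nu * slopes[j] * Z[:, j]
    return intercept, coef, Z

def ols_r2(Z, y, cols):
    """R^2 of an ordinary least-squares refit on the given columns."""
    A = np.column_stack([np.ones(len(y)), Z[:, cols]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return r_squared(y, A @ beta)

def backward_eliminate(Z, y, cols, keep_fraction=0.99):
    """Drop predictors one at a time while the reduced model keeps at least
    keep_fraction of the starting R^2 (an assumed stopping rule)."""
    cols = list(cols)
    full_r2 = ols_r2(Z, y, cols)
    while len(cols) > 1:
        # R^2 obtained after dropping each remaining predictor in turn.
        trials = [(ols_r2(Z, y, [c for c in cols if c != d]), d) for d in cols]
        best_r2, drop = max(trials)
        if best_r2 < keep_fraction * full_r2:
            break
        cols.remove(drop)
    return cols, full_r2

# Synthetic data only: 47 "participants", 20 predictors, 3 of them informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(47, 20))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(size=47)

intercept, coef, Z = componentwise_l2_boost(X, y)
selected = [int(j) for j in np.flatnonzero(coef)]   # predictors ever picked
boost_r2 = r_squared(y, intercept + Z @ coef)
kept, full_r2 = backward_eliminate(Z, y, selected)
reduced_r2 = ols_r2(Z, y, kept)

print("selected by boosting:", selected)
print("kept after elimination:", kept)
print("reduced R^2 as a fraction of selected-model R^2:", reduced_r2 / full_r2)
```

Predictors whose coefficient is never updated stay at exactly zero, which is how boosting performs variable selection here. In practice the number of boosting steps would typically be tuned (for example by cross-validation) rather than fixed as in this sketch.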