Metcalf Kathryne
University of California San Diego, San Diego, CA, USA.
Soc Stud Sci. 2025 Apr;55(2):209-237. doi: 10.1177/03063127241288223. Epub 2024 Oct 7.
The opaque relationship between biology and behavior is an intractable problem for psychiatry, and it increasingly challenges longstanding diagnostic categorizations. While various big data sciences have been repeatedly deployed as potential solutions, they have so far complicated more than they have managed to disentangle. Attending to , this article proposes one reason why this is the case: Datasets have to instantiate clinical categories in order to make biological sense of them, and they do so in different ways. Here, I use mixed methods to examine the role of the reuse of big data in recent genomic research on autism spectrum disorder (ASD). I show how divergent regimes of psychiatric categorization are innately encoded within commonly used datasets from MSSNG and 23andMe, contributing to a rippling disjuncture in the accounts of autism that this body of research has produced. Beyond the specific complications this dynamic introduces for the category of autism, this paper argues for the necessity of critical attention to the role of dataset reuse and recombination across human genomics and beyond.