Arnold Andrew, Cohen William W
WASA. 2009 Jan 1;5682:541-550. doi: 10.1007/978-3-642-03417-6_53.
In this paper we explore the usefulness of various types of publication-related metadata, such as citation networks and curated databases, for the task of identifying genes in academic biomedical publications. Specifically, we examine whether knowing which genes an author has previously written about, combined with information about previous coauthors and citations, can help us predict which new genes the author is likely to write about in the future. Framed in this way, the problem becomes one of predicting links between authors and genes in the publication network. We show that this purely social-network-based link prediction technique outperforms various baselines, including those relying only on non-social biological information.
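To make the link-prediction framing concrete, here is a minimal sketch in Python. It is not the method used in the paper, which the abstract does not specify. It uses a simple neighbor-counting heuristic: genes an author has not yet written about are ranked by how many of that author's coauthors have written about them. The author names, gene assignments, and the function `rank_candidate_genes` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy data: genes each author has written about so far.
author_genes = {
    "alice": {"BRCA1", "TP53"},
    "bob": {"TP53", "EGFR"},
    "carol": {"EGFR", "KRAS"},
}

# Hypothetical coauthorship links (undirected, stored in both directions).
coauthors = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob"},
}


def rank_candidate_genes(author, author_genes, coauthors):
    """Rank genes the author has not yet written about by the number
    of the author's coauthors who have written about them.

    This is an illustrative heuristic, not the paper's technique.
    """
    known = author_genes.get(author, set())
    counts = Counter()
    for peer in coauthors.get(author, set()):
        for gene in author_genes.get(peer, set()):
            if gene not in known:
                counts[gene] += 1
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = rank_candidate_genes("alice", author_genes, coauthors)
print(ranking)
```

Each returned pair is a candidate author-gene link together with its score. A fuller treatment would presumably also draw on citation links and curated database entries, as the abstract describes, but how those signals are combined is not stated here.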