Subramanian Sandeep, Ganapathiraju Madhavi K
Language Technologies Institute, Carnegie Mellon University.
Department of Biomedical Informatics, and Intelligent Systems Program, University of Pittsburgh.
Data (Basel). 2017 Dec;2(4). doi: 10.3390/data2040038. Epub 2017 Nov 21.
Bio-molecular reagents such as antibodies, which are required in experimental biology, are expensive, and their effectiveness, among other things, is critical to the success of an experiment. Although such resources are sometimes donated by one investigator to another through personal communication, to our knowledge there is no previous study of the extent of such donations, nor a central platform that directs resource seekers to donors. In this paper, we describe what is, to our knowledge, the first attempt at building a web portal, titled Antibody Exchange (or, more generally, 'Bio-Resource Exchange'), that aims to bridge this gap between resource seekers and donors in experimental biology. Users of the portal can request or donate antibodies, cell lines and DNA constructs. The portal could also serve as a crowd-sourced database of resources for experimental biology. We also studied the extent of antibody donations by mining the acknowledgements sections of scientific articles. Specifically, for every donation we extracted the name of the donor, the donor's affiliation and the name of the antibody by parsing these sections. To extract annotations at this level, we adopted two approaches: a rule-based algorithm and a bootstrapped pattern-learning algorithm. These two algorithms extracted donor names, affiliations and antibody names with average accuracies of 57% and 62%, respectively. We also created a dataset of 50 expert-annotated acknowledgements sections that will serve as a gold standard for evaluating extraction algorithms in the future.
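As a rough illustration of what a rule-based extractor over acknowledgement text might look like, the Python sketch below applies a single hand-written pattern. The pattern, the `Donation` record and the `extract_donations` function are hypothetical and are not the rules used in the paper; real acknowledgements vary far more than this one rule covers.

```python
import re
from typing import List, NamedTuple, Optional


class Donation(NamedTuple):
    """One extracted donation (hypothetical record layout)."""
    donor: str
    affiliation: Optional[str]
    antibody: str


# Hypothetical rule, assumed for illustration only:
#   "thank(s) [Dr./Prof.] <Name> [(<Affiliation>)] for [providing] [the] <X> antibody/antibodies"
DONATION_PATTERN = re.compile(
    r"thanks?\s+"
    r"(?:(?:Dr|Prof|Professor)\.?\s+)?"                       # optional title
    r"(?P<donor>[A-Z][\w.\-]*(?:\s+[A-Z][\w.\-]*)+)"          # two or more capitalised tokens
    r"(?:\s*\((?P<affiliation>[^)]+)\))?"                     # optional "(affiliation)"
    r"\s+for\s+(?:(?:providing|donating|the\s+gift\s+of)\s+)?"
    r"(?:the\s+)?"
    r"(?P<antibody>[\w\-/ ]+?)\s+antibod(?:y|ies)"            # antibody name before the keyword
)


def extract_donations(acknowledgement: str) -> List[Donation]:
    """Return every donation that matches the single rule above."""
    return [
        Donation(m.group("donor"), m.group("affiliation"), m.group("antibody"))
        for m in DONATION_PATTERN.finditer(acknowledgement)
    ]


if __name__ == "__main__":
    text = ("We thank Dr. Jane Doe (Example University) "
            "for providing the anti-GFP antibody.")
    for donation in extract_donations(text):
        print(donation)
```

A single rule like this misses most phrasings (passive voice, "kindly provided by", multiple donors in one sentence), which is consistent with the modest accuracies reported in the abstract and with the motivation for a pattern-learning approach.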