Florida Museum of Natural History, University of Florida, Gainesville, Florida, United States of America.
PLoS One. 2018 Dec 19;13(12):e0207636. doi: 10.1371/journal.pone.0207636. eCollection 2018.
Recent changes in institutional cyberinfrastructure and collections data storage methods have dramatically improved the accessibility of specimen-based data through digital databases and data aggregators. This analysis of digitized fish collections in the U.S. demonstrates how information from a data aggregator, in this case iDigBio, can be extracted and analyzed. Data from U.S. institutional fish collections in iDigBio were explored through a strictly programmatic approach using the ridigbio package and the fishfindR web application. iDigBio aggregates collections data on a purely voluntary basis: collection staff must consent to sharing their data. Not all collections share their data with iDigBio, but the 38 of the 143 known U.S. fish collections whose data have been harvested by iDigBio account for the majority of fish specimens housed in U.S. collections. In the 22 years since publication of the last survey providing information on these 38 collections, 1,219,168 specimen records (lots), 15,225,744 specimens, 3,192 primary types, and 32,868 records of secondary types have been added. This is an increase of 65.1% in the number of cataloged records and an increase of 56.1% in the number of specimens. In addition to providing specimen-based data for research, education, and outreach, data that are accessible via data aggregators can be used to develop accurate, up-to-date reports on institutional collections. Such reports present collections data in an organized and accessible fashion, can guide targeted efforts by collections personnel to meet discipline-specific needs, and make data more transparent to downstream users.
Data from this survey will be updated and published regularly in a dynamic web application that will help collections staff communicate the value of their collections while giving stakeholders a way to explore collections holdings in relation to the institutions that house them. Through this resource, collections will be able to compare their data with those of similar collections to help secure financial and institutional support.
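The growth figures above pair absolute gains with percent increases, so the totals they imply can be worked out directly. The following is a minimal sketch of that arithmetic, not a calculation taken from the paper. It assumes each percentage is measured relative to the earlier survey's total for the same 38 collections, and the function name `implied_totals` is hypothetical. Because the published percentages are rounded to one decimal place, the results are approximate.

```python
from typing import Tuple


def implied_totals(added: int, pct_increase: float) -> Tuple[float, float]:
    """Return (baseline, current) totals implied by a gain and a percent increase.

    Assumes pct_increase = 100 * added / baseline, i.e. the increase is
    expressed relative to the earlier (baseline) total.
    """
    if pct_increase <= 0:
        raise ValueError("pct_increase must be positive")
    baseline = added / (pct_increase / 100.0)
    return baseline, baseline + added


# Figures quoted in the abstract.
records_added = 1_219_168      # specimen records (lots) added
specimens_added = 15_225_744   # specimens added

records_baseline, records_current = implied_totals(records_added, 65.1)
specimens_baseline, specimens_current = implied_totals(specimens_added, 56.1)

# Approximate values only: the input percentages are rounded.
print(f"records:   baseline ~{records_baseline:,.0f}, current ~{records_current:,.0f}")
print(f"specimens: baseline ~{specimens_baseline:,.0f}, current ~{specimens_current:,.0f}")
```

The same function applies to any count reported as an absolute gain plus a percent increase, provided the percentage is stated against the earlier total.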