Vancouver Prostate Centre, University of British Columbia, 2660 Oak Street, Vancouver, BC V6H 3Z6, Canada.
Database (Oxford). 2023 Apr 3;2023. doi: 10.1093/database/baad016.
Isolating proteins of interest from cell lysates is an integral step in studying protein structure and function. Liquid chromatography is a technique commonly used for protein purification, in which separation exploits differences in the physical and chemical characteristics of proteins. The complex nature of proteins requires researchers to carefully choose buffers that maintain the stability and activity of the protein while also allowing appropriate interaction with chromatography columns. To choose the proper buffer, biochemists often search the literature for reports of successful purification; however, they frequently encounter roadblocks such as lack of access to journals, non-exhaustive specification of components and unfamiliar naming conventions. To overcome these issues, we present PurificationDB (https://purificationdatabase.herokuapp.com/), an open-access and user-friendly knowledge base that contains 4732 curated and standardized entries of protein purification conditions. Buffer specifications were derived from the literature using named-entity recognition techniques built on common nomenclature provided by protein biochemists. PurificationDB also incorporates information from two well-known protein databases: Protein Data Bank and UniProt. PurificationDB facilitates easy access to data on protein purification techniques and contributes to the growing effort to create open resources that organize experimental conditions and data for improved access and analysis. Database URL: https://purificationdatabase.herokuapp.com/.
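To illustrate what deriving standardized buffer specifications from free text can involve, the following is a minimal sketch of dictionary-based matching, a much simpler stand-in for the named-entity recognition described above. It is not the PurificationDB pipeline: the synonym table, the pattern and the function name `extract_buffer_components` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical synonym table: lowercase surface form -> standardized name.
# Illustrative entries only, not PurificationDB's actual vocabulary.
SYNONYMS = {
    "tris": "Tris",
    "tris-hcl": "Tris",
    "hepes": "HEPES",
    "nacl": "NaCl",
    "sodium chloride": "NaCl",
    "dtt": "DTT",
    "dithiothreitol": "DTT",
    "glycerol": "glycerol",
    "imidazole": "imidazole",
}

# Longest names first so that "tris-hcl" is preferred over "tris".
_NAMES = "|".join(re.escape(n) for n in sorted(SYNONYMS, key=len, reverse=True))

# A concentration (number + unit) followed by a known component name.
# Units are case-sensitive (mM vs. M); component names are not.
PATTERN = re.compile(
    rf"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>mM|µM|uM|M|%)\s+(?P<name>(?i:{_NAMES}))\b"
)


def extract_buffer_components(text):
    """Return (standard_name, value, unit) tuples found in a methods sentence."""
    return [
        (SYNONYMS[m.group("name").lower()], float(m.group("value")), m.group("unit"))
        for m in PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    sentence = (
        "Cells were lysed in 50 mM Tris-HCl pH 8.0, 300 mM sodium chloride, "
        "10% glycerol and 1 mM DTT."
    )
    for component in extract_buffer_components(sentence):
        print(component)
```

Mapping every surface form to one standardized name is what lets entries that say "sodium chloride" and "NaCl" be searched together; a real system would likely need a far larger vocabulary and handling for components reported without concentrations.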