Data mining patented antibody sequences.

Affiliations

Research and Development, Natural Antibody, Hamburg, Germany.

Department of Bio and Health Informatics, R&D, AstraZeneca, Cambridge, UK.

Publication information

MAbs. 2021 Jan-Dec;13(1):1892366. doi: 10.1080/19420862.2021.1892366.

Abstract

The patent literature should reflect the past 30 years of engineering efforts directed toward developing monoclonal antibody therapeutics. Such information is potentially valuable for rational antibody design. Patents, however, are designed not to convey scientific knowledge, but to provide legal protection. It is not obvious whether antibody information from patent documents, such as antibody sequences, is useful in conveying engineering know-how, rather than as a legal reference only. To assess the utility of patent data for therapeutic antibody engineering, we quantified the amount of antibody sequences in patents destined for medicinal purposes and how well they reflect the primary sequences of therapeutic antibodies in clinical use. We identified 16,526 patent families covering major jurisdictions (e.g., US Patent and Trademark Office (USPTO) and World Intellectual Property Organization) that contained antibody sequences. These families held 245,109 unique antibody chains (135,397 heavy chains and 109,712 light chains) that we compiled in our Patented Antibody Database (PAD, http://naturalantibody.com/pad). We find that antibodies make up a non-trivial proportion of all patent amino acid sequence depositions (e.g., 11% of USPTO Full Text database). Our analysis of the 16,526 families demonstrates that the volume of patent documents with antibody sequences is growing, with the majority of documents classified as containing antibodies for medicinal purposes. We further studied the 245,109 antibody chains from patent literature to reveal that they very well reflect the primary sequences of antibody therapeutics in clinical use. This suggests that the patent literature could serve as a reference for previous engineering efforts to improve rational antibody design.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f90a/7971238/6d5ca10019e8/KMAB_A_1892366_F0001_OC.jpg
