A database linking Chinese patents to China's census firms.

Affiliations

Tilburg University, 5037 AB Tilburg, Tilburg, 5000 LE, Netherlands.

University of Colorado, Leeds School of Business, 995 Regent Dr, Boulder, CO 80309, USA.

Publication information

Sci Data. 2018 Mar 27;5:180042. doi: 10.1038/sdata.2018.42.

Abstract

To meet researchers' increasing interest in the fast-growing innovation activities taking place in China, we match patents filed with China's State Intellectual Property Office to firms covered in China's Census. China has experienced strong growth in patent filings over the past two decades and, since 2011, has been the world's top patent-filing country. China's Census database covers about one million unique manufacturing firms from 1998 to 2009, representing the broad Chinese economy. We design data parsing and pre-processing routines to clean and stem firm and assignee names, create a matching algorithm that fits our data and balances matching accuracy against the workload of manual checking, and implement a systematic manual check process to filter out false positives generated by computerized matching. Our project generates 1,113,588 matches for the Census firms, among which 849,647 patents are uniquely matched. By creating the patent-firm linked dataset, we hope to reduce duplicative effort and encourage more research to better understand China's fast-changing innovation landscape.
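
The clean-stem-match pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the suffix list, the function names `stem_name` and `match_patents`, and the exact-match rule on stemmed names are all assumptions made for illustration, and the firm and assignee names are invented placeholders.

```python
import re
from collections import defaultdict

# Hypothetical legal-form suffixes to strip from names (an assumption,
# not the list used in the paper). Longer suffixes come first so that
# "股份有限公司" is removed before the shorter "有限公司" or "公司".
SUFFIXES = ["股份有限公司", "有限责任公司", "有限公司", "集团", "公司", "厂"]


def stem_name(name: str) -> str:
    """Clean a firm or assignee name and strip trailing legal-form suffixes."""
    # Drop whitespace and punctuation; Chinese characters count as word
    # characters in Python 3 regexes, so they are kept.
    cleaned = re.sub(r"[\s\W_]+", "", name)
    changed = True
    while changed:
        changed = False
        for suffix in SUFFIXES:
            # Never strip a suffix if that would leave an empty name.
            if cleaned.endswith(suffix) and len(cleaned) > len(suffix):
                cleaned = cleaned[: -len(suffix)]
                changed = True
                break
    return cleaned


def match_patents(firms, patents):
    """Return candidate (patent_id, firm_id) pairs whose stemmed names agree.

    firms:   iterable of (firm_id, firm_name)
    patents: iterable of (patent_id, assignee_name)
    """
    index = defaultdict(list)
    for firm_id, firm_name in firms:
        index[stem_name(firm_name)].append(firm_id)

    candidates = []
    for patent_id, assignee in patents:
        for firm_id in index.get(stem_name(assignee), []):
            candidates.append((patent_id, firm_id))
    return candidates


# Invented placeholder names, for illustration only.
firms = [("F1", "甲乙科技有限公司"), ("F2", "丙丁集团公司")]
patents = [("P1", "甲乙科技 有限公司"), ("P2", "丙丁集团"), ("P3", "某某大学")]
candidates = match_patents(firms, patents)
```

Aggressive stemming of this kind tends to raise recall at the cost of more false positives, which is why the candidate pairs would still need the kind of manual check the abstract describes before being accepted as matches.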

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ce0/5956277/9d528c86915a/sdata201842-f1.jpg
