Occupational models from 42 million unstructured job postings.

Author information

Dixon Nile, Goggins Marcelle, Ho Ethan, Howison Mark, Long Joe, Northcott Emma, Shen Karen, Yeats Carrie

Affiliations

Research Improving People's Lives, 1 Park Row, Suite 401, Providence, RI 02903, USA.

National Association of State Workforce Agencies, 444 N. Capitol Street NW, Suite 300, Washington, DC 20001, USA.

Publication information

Patterns (N Y). 2023 May 22;4(7):100757. doi: 10.1016/j.patter.2023.100757. eCollection 2023 Jul 14.

Abstract

Structuring jobs into occupations is the first step for analysis tasks in many fields of research, including economics and public health, as well as for practical applications like matching job seekers to available jobs. We present a data resource, derived with natural language processing techniques from over 42 million unstructured job postings in the National Labor Exchange, that empirically models the associations between occupation codes (estimated initially by the Standardized Occupation Coding for Computer-assisted Epidemiological Research method), skill keywords, job titles, and full-text job descriptions in the United States during the years 2019 and 2021. We model the probability that a job title is associated with an occupation code and that a job description is associated with skill keywords and occupation codes. Our models are openly available in a Python package that can assign occupation codes to job titles, parse skills from and assign occupation codes to job postings and resumes, and estimate occupational similarity among job postings, resumes, and occupation codes.
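As a rough illustration of what "the probability that a job title is associated with an occupation code" can mean in practice, the sketch below estimates P(code | title) by relative frequency from labeled (title, code) pairs. This is a minimal example under stated assumptions, not the authors' method or their package's API: the function names `fit_title_model` and `rank_codes` are hypothetical, the toy data are invented for illustration, and the published models may use a different estimator, title normalization, or smoothing.

```python
from collections import Counter, defaultdict


def fit_title_model(pairs):
    """Estimate P(code | title) by relative frequency.

    `pairs` is an iterable of (job_title, occupation_code) tuples.
    Titles are lowercased and stripped; this normalization is an
    assumption made for the sketch.
    """
    counts = defaultdict(Counter)
    for title, code in pairs:
        counts[title.strip().lower()][code] += 1

    model = {}
    for title, code_counts in counts.items():
        total = sum(code_counts.values())
        model[title] = {code: n / total for code, n in code_counts.items()}
    return model


def rank_codes(model, title):
    """Return (code, probability) pairs for a title, most probable first."""
    probs = model.get(title.strip().lower(), {})
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: titles paired with illustrative occupation codes.
pairs = [
    ("Registered Nurse", "29-1141"),
    ("Registered Nurse", "29-1141"),
    ("Registered Nurse", "29-1141"),
    ("Registered Nurse", "29-2061"),
    ("Data Analyst", "15-2051"),
]

model = fit_title_model(pairs)
print(rank_codes(model, "registered nurse"))
# → [('29-1141', 0.75), ('29-2061', 0.25)]
```

A title never seen in the data yields an empty list here; a real system would need a fallback such as partial matching on title tokens.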

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7264/10382938/a49fabacf59f/gr1.jpg
