Automatic extraction of property norm-like data from large text corpora.

Author information

Kelly Colin, Devereux Barry, Korhonen Anna

Publication information

Cogn Sci. 2014 May-Jun;38(4):638-82. doi: 10.1111/cogs.12091.

DOI: 10.1111/cogs.12091
PMID: 25019134
Abstract

Traditional methods for deriving property-based representations of concepts from text have focused on either extracting only a subset of possible relation types, such as hyponymy/hypernymy (e.g., car is-a vehicle) or meronymy/metonymy (e.g., car has wheels), or unspecified relations (e.g., car--petrol). We propose a system for the challenging task of automatic, large-scale acquisition of unconstrained, human-like property norms from large text corpora, and discuss the theoretical implications of such a system. We employ syntactic, semantic, and encyclopedic information to guide our extraction, yielding concept-relation-feature triples (e.g., car be fast, car require petrol, car cause pollution), which approximate property-based conceptual representations. Our novel method extracts candidate triples from parsed corpora (Wikipedia and the British National Corpus) using syntactically and grammatically motivated rules, then reweights triples with a linear combination of their frequency and four statistical metrics. We assess our system output in three ways: lexical comparison with norms derived from human-generated property norm data, direct evaluation by four human judges, and a semantic distance comparison with both WordNet similarity data and human-judged concept similarity ratings. Our system offers a viable and performant method of plausible triple extraction: Our lexical comparison shows comparable performance to the current state-of-the-art, while subsequent evaluations exhibit the human-like character of our generated properties.
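The reweighting step described in the abstract (a linear combination of a triple's frequency and four statistical metrics) could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the four metrics or give the weights, so the metric scores, the weights, and the names `Candidate`, `score`, and `rerank` are all hypothetical, and the numbers are invented for the example rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """A candidate concept-relation-feature triple (hypothetical structure)."""
    concept: str
    relation: str
    feature: str
    frequency: float            # normalized corpus frequency (assumed scale 0..1)
    metrics: Tuple[float, ...]  # four placeholder metric scores; not the paper's


def score(c: Candidate, freq_weight: float, metric_weights: Sequence[float]) -> float:
    """Linear combination of frequency and the metric scores."""
    if len(c.metrics) != len(metric_weights):
        raise ValueError("one weight is required per metric")
    return freq_weight * c.frequency + sum(
        w * m for w, m in zip(metric_weights, c.metrics)
    )


def rerank(cands: Sequence[Candidate], freq_weight: float,
           metric_weights: Sequence[float]) -> List[Candidate]:
    """Return candidates sorted from highest to lowest combined score."""
    return sorted(
        cands,
        key=lambda c: score(c, freq_weight, metric_weights),
        reverse=True,
    )


# Invented example values, for illustration only.
candidates = [
    Candidate("car", "be", "fast", 0.9, (0.2, 0.1, 0.3, 0.2)),
    Candidate("car", "require", "petrol", 0.4, (0.8, 0.7, 0.6, 0.9)),
    Candidate("car", "cause", "pollution", 0.3, (0.5, 0.4, 0.5, 0.6)),
]
weights = (0.2, 0.2, 0.2, 0.2)  # assumed equal weights
ranked = rerank(candidates, freq_weight=0.2, metric_weights=weights)
for c in ranked:
    print(c.concept, c.relation, c.feature,
          round(score(c, 0.2, weights), 2))
```

With these invented inputs, a frequent triple can be ranked below a rarer one whose metric scores are higher, which is the purpose of reweighting beyond raw frequency. How the weights would actually be chosen is not stated in the abstract.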


Similar articles

1. Automatic extraction of property norm-like data from large text corpora.
   Cogn Sci. 2014 May-Jun;38(4):638-82. doi: 10.1111/cogs.12091.
2. A practical primer on processing semantic property norm data.
   Cogn Process. 2020 Nov;21(4):587-599. doi: 10.1007/s10339-019-00939-6. Epub 2019 Nov 25.
3. Similarity Judgment Within and Across Categories: A Comprehensive Model Comparison.
   Cogn Sci. 2021 Aug;45(8):e13030. doi: 10.1111/cogs.13030.
4. The role of corpus size and syntax in deriving lexico-semantic representations for a wide range of concepts.
   Q J Exp Psychol (Hove). 2015;68(8):1643-64. doi: 10.1080/17470218.2014.994098. Epub 2015 Feb 26.
5. Context Matters: Recovering Human Semantic Structure from Machine Learning Analysis of Large-Scale Text Corpora.
   Cogn Sci. 2022 Feb;46(2):e13085. doi: 10.1111/cogs.13085.
6. Computational methods to extract meaning from text and advance theories of human cognition.
   Top Cogn Sci. 2011 Jan;3(1):3-17. doi: 10.1111/j.1756-8765.2010.01117.x. Epub 2010 Sep 7.
7. tESA: a distributional measure for calculating semantic relatedness.
   J Biomed Semantics. 2016 Dec 28;7(1):67. doi: 10.1186/s13326-016-0109-6.
8. Retrofitting Concept Vector Representations of Medical Concepts to Improve Estimates of Semantic Similarity and Relatedness.
   Stud Health Technol Inform. 2017;245:657-661.
9. Linked open data-based framework for automatic biomedical ontology generation.
   BMC Bioinformatics. 2018 Sep 10;19(1):319. doi: 10.1186/s12859-018-2339-3.
10. Automatic lexicon acquisition for a medical cross-language information retrieval system.
    Stud Health Technol Inform. 2005;116:829-34.

Cited by

1. Semantic projection recovers rich human knowledge of multiple object features from word embeddings.
   Nat Hum Behav. 2022 Jul;6(7):975-987. doi: 10.1038/s41562-022-01316-8. Epub 2022 Apr 14.
2. The Centre for Speech, Language and the Brain (CSLB) concept property norms.
   Behav Res Methods. 2014 Dec;46(4):1119-27. doi: 10.3758/s13428-013-0420-4.