
Extraction of time-related expressions using text mining with application to Hebrew.

Affiliations

Dept. of Computer Science, Jerusalem College of Technology-Lev Academic Center, Jerusalem, Israel.

Dept. of Computer Science, Bar-Ilan University, Ramat-Gan, Israel.

Publication information

PLoS One. 2024 Feb 23;19(2):e0293196. doi: 10.1371/journal.pone.0293196. eCollection 2024.

DOI:10.1371/journal.pone.0293196
PMID:38394097
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10889890/
Abstract

In this research, we extract time-related expressions from a rabbinic text in a semi-automatic manner. These expressions usually appear next to rabbinic references (name / nickname / acronym / book-name). The first step toward our goal is to find all the expressions near references in the corpus. However, not all of the phrases around the references are time-related expressions, so these phrases are initially treated only as potential time-related expressions. To extract the time-related expressions, we formulate two new statistical functions, and we use screening and heuristic methods. We tested these statistical functions, grammatical screenings, and heuristic methods on a corpus of responsa documents in which many rabbinic citations are known and marked. The statistical functions and the screening methods filtered the potential time-related expressions, reducing their number by 99.88% (from 484,681 to 575).
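The first step described in the abstract, collecting the phrases that sit next to marked references, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the two statistical functions or the screening rules, so none of them is reproduced here. The `REF` placeholder token, the window of up to three tokens, and the names `candidate_phrases` and `reduction_percent` are all assumptions made for illustration. The last helper only rechecks the reduction figure reported in the abstract.

```python
from collections import Counter


def candidate_phrases(tokens, ref_positions, max_len=3):
    """Count n-grams (1..max_len tokens) directly before or after each reference.

    Hypothetical helper: the window size and the token-level representation
    are assumptions, not details taken from the paper.
    """
    counts = Counter()
    for pos in ref_positions:
        for n in range(1, max_len + 1):
            # n-gram ending right before the reference
            if pos - n >= 0:
                counts[tuple(tokens[pos - n:pos])] += 1
            # n-gram starting right after the reference
            if pos + 1 + n <= len(tokens):
                counts[tuple(tokens[pos + 1:pos + 1 + n])] += 1
    return counts


def reduction_percent(initial, remaining):
    """Percentage of the initial candidates removed by filtering."""
    return 100.0 * (initial - remaining) / initial


# Toy English stand-in for a tokenized document; "REF" marks a known citation.
tokens = "as wrote long ago REF in his time and later REF of blessed memory".split()
ref_positions = [i for i, tok in enumerate(tokens) if tok == "REF"]
candidates = candidate_phrases(tokens, ref_positions)

# Recheck the figure from the abstract: 484,681 candidates reduced to 575.
reported = round(reduction_percent(484_681, 575), 2)
```

In this toy run every collected n-gram is only a potential time-related expression; deciding which ones to keep is the job of the statistical functions and screenings that the abstract mentions but does not specify.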


