What's missing in geographical parsing?

Authors

Gritta Milan, Pilehvar Mohammad Taher, Limsopatham Nut, Collier Nigel

Affiliation

Language Technology Lab (LTL), Department of Theoretical and Applied Linguistics (DTAL), University of Cambridge, 9 West Road, Cambridge, CB3 9DP UK.

Publication

Lang Resour Eval. 2018;52(2):603-623. doi: 10.1007/s10579-017-9385-8. Epub 2017 Mar 7.

DOI: 10.1007/s10579-017-9385-8
PMID: 31258456
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6560650/

Abstract

Geographical data can be obtained by converting place names from free-format text into geographical coordinates. The ability to geo-locate events in textual reports represents a valuable source of information in many real-world applications such as emergency responses, real-time social media geographical event analysis, understanding location instructions in auto-response systems and more. However, geoparsing is still widely regarded as a challenge because of domain language diversity, place name ambiguity, metonymic language and limited leveraging of context as we show in our analysis. Results to date, whilst promising, are on laboratory data and unlike in wider NLP are often not cross-compared. In this study, we evaluate and analyse the performance of a number of leading geoparsers on a number of corpora and highlight the challenges in detail. We also publish an automatically geotagged Wikipedia corpus to alleviate the dearth of (open source) corpora in this domain.
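
To make the two stages of geoparsing concrete (finding place names, then resolving each to coordinates), here is a minimal sketch. It is not one of the geoparsers evaluated in the paper: the `GAZETTEER` table, its rough coordinates and populations, and the function names `geotag`, `resolve` and `geoparse` are all hypothetical, and the "pick the most populous candidate" rule is only a common context-free baseline, assumed here for illustration.

```python
import re

# Hypothetical toy gazetteer: lowercase name -> candidate places as
# (label, latitude, longitude, population). Values are rough and illustrative.
GAZETTEER = {
    "cambridge": [
        ("Cambridge, England, UK", 52.205, 0.119, 145_000),
        ("Cambridge, Massachusetts, USA", 42.374, -71.106, 118_000),
    ],
    "paris": [
        ("Paris, France", 48.857, 2.352, 2_100_000),
        ("Paris, Texas, USA", 33.661, -95.556, 25_000),
    ],
}


def geotag(text):
    """Stage 1: find candidate place names by plain dictionary lookup."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [tok for tok in tokens if tok.lower() in GAZETTEER]


def resolve(name):
    """Stage 2: choose the most populous candidate, ignoring all context."""
    candidates = GAZETTEER[name.lower()]
    return max(candidates, key=lambda c: c[3])


def geoparse(text):
    """Return (mention, label, latitude, longitude) for each tagged name."""
    return [(name, *resolve(name)[:3]) for name in geotag(text)]


result = geoparse("The lab in Cambridge studied reports from Paris.")
for mention, label, lat, lon in result:
    print(f"{mention} -> {label} ({lat}, {lon})")
```

A baseline like this shows why the abstract lists ambiguity, metonymy and limited use of context as open problems: it would send every "Cambridge" to England regardless of the surrounding text, and it would tag "Paris" as a location even in a sentence where the name stands for the French government.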

Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0170/6560650/4ac103308b66/10579_2017_9385_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0170/6560650/b31cd86a6256/10579_2017_9385_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0170/6560650/d26bd213440d/10579_2017_9385_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0170/6560650/0adc3ba5eabc/10579_2017_9385_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0170/6560650/f91d34632414/10579_2017_9385_Fig5_HTML.jpg
