Journal: Expert Systems with Applications

Geographic Named Entity Recognition and Disambiguation in Mexican News using word embeddings



Abstract

In recent years, dense word embeddings have been widely used for text representation because they can model complex semantic and morphological characteristics of language, such as meaning in specific contexts and applications. Unlike sparse representations, such as one-hot encoding or frequency counts, word embeddings provide computational advantages and improved results in many natural language processing (NLP) tasks, such as the automatic extraction of geospatial information. Discovering geographic information in natural language text involves a complex process called geoparsing. In this work, we explore the use of word embeddings for two NLP tasks, Geographic Named Entity Recognition and Geographic Entity Disambiguation, as part of an effort to develop the first Mexican geoparser. Our study shows that relationships between geographic and semantic spaces arise when we apply word embedding models to a corpus of documents in Mexican Spanish. Our models achieved high accuracy for geographic named entity recognition in Spanish.
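
The contrast the abstract draws between sparse and dense representations can be illustrated with a minimal sketch. It is not the authors' model: the three-word vocabulary and the 3-dimensional vectors in `toy_embeddings` are invented for illustration, whereas real embeddings would be learned from a corpus and have many more dimensions. The point is only that one-hot vectors of different words are always orthogonal, while dense vectors can place two place names closer to each other than to an unrelated word.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def one_hot(word, vocab):
    """Sparse representation: a single 1 at the word's vocabulary index."""
    return [1.0 if w == word else 0.0 for w in vocab]


vocab = ["guadalajara", "monterrey", "lluvia"]

# Hypothetical dense vectors, made up for this example only.
toy_embeddings = {
    "guadalajara": [0.9, 0.8, 0.1],
    "monterrey": [0.8, 0.9, 0.2],
    "lluvia": [0.1, 0.2, 0.9],
}

# One-hot vectors of distinct words share no non-zero dimension.
sparse_sim = cosine(one_hot("guadalajara", vocab), one_hot("monterrey", vocab))

# Dense vectors can encode graded similarity between words.
dense_city = cosine(toy_embeddings["guadalajara"], toy_embeddings["monterrey"])
dense_other = cosine(toy_embeddings["guadalajara"], toy_embeddings["lluvia"])

print(f"one-hot  guadalajara vs monterrey: {sparse_sim:.2f}")
print(f"dense    guadalajara vs monterrey: {dense_city:.2f}")
print(f"dense    guadalajara vs lluvia:    {dense_other:.2f}")
```

With these toy values, the two city names score far higher against each other than either does against the unrelated word, which is the kind of structure a sparse encoding cannot express.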
