Populating a multilingual ontology of proper names from open sources

Agata Savary; Leszek Manicki; Małgorzata Baron

Abstract

Although proper names play a central role in natural language processing (NLP) applications, they are still under-represented in lexicons, annotated corpora, and other resources dedicated to text processing. The main challenges are the prevalence and the dynamicity of proper names. At the same time, large and regularly updated knowledge sources containing partially structured data, such as Wikipedia or GeoNames, are publicly available and contain large numbers of proper names. We present a method for the semi-automatic enrichment of Prolexbase, an existing multilingual ontology of proper names dedicated to natural language processing, with data extracted from these open sources in three languages: Polish, English and French. Fine-grained data extraction and integration procedures allow the user to enrich the previous contents of Prolexbase with new incoming data. All data are manually validated and available under an open licence.

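Bibliographic details

Source: Journal of Language Modelling, 2013, issue 2, 37 pages
Authors: Agata Savary; Leszek Manicki; Małgorzata Baron
Original format: PDF
Classification (Chinese Library Classification): computing technology, computer technology
Date added to database: 2022-08-18 16:38:41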
