EDRAK: Entity-Centric Data Resource for Arabic Knowledge

Abstract

Online Arabic content is growing very rapidly, but structured Arabic resources have not grown at a matching pace. Systems that perform standard Natural Language Processing (NLP) tasks such as Named Entity Disambiguation (NED) struggle to deliver decent quality because rich Arabic entity repositories are lacking. In this paper, we introduce EDRAK, an automatically generated, comprehensive, entity-centric Arabic resource. EDRAK contains more than two million entities together with their Arabic names and contextual keyphrases. Manual evaluation confirmed the quality of the generated data. We are making EDRAK publicly available as a valuable resource to help advance research in Arabic NLP and IR tasks such as dictionary-based Named-Entity Recognition, entity classification, and entity summarization.
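As an illustration of how a resource of this shape might support dictionary-based Named-Entity Recognition, here is a minimal Python sketch. It is not EDRAK's actual format or the authors' method: the `Entity` record, the entity identifiers, the sample names and keyphrases, and the greedy longest-match lookup are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Entity:
    """Hypothetical record: an entity with Arabic names and keyphrases."""
    entity_id: str
    names: List[str]
    keyphrases: List[str] = field(default_factory=list)


def build_name_index(entities: List[Entity]) -> Dict[Tuple[str, ...], Set[str]]:
    """Map each whitespace-tokenized name to the ids of entities that carry it."""
    index: Dict[Tuple[str, ...], Set[str]] = {}
    for entity in entities:
        for name in entity.names:
            index.setdefault(tuple(name.split()), set()).add(entity.entity_id)
    return index


def find_mentions(tokens, index, max_len=5):
    """Greedy longest-match lookup; returns (start, end, candidate_ids) spans."""
    mentions = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest window first so multi-word names win over their parts.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in index:
                match = (i, i + n, sorted(index[key]))
                break
        if match:
            mentions.append(match)
            i = match[1]
        else:
            i += 1
    return mentions


# Toy data (assumed, not taken from EDRAK).
entities = [
    Entity("Cairo", ["القاهرة"], ["عاصمة مصر"]),
    Entity("Egypt", ["مصر"]),
    Entity("Cairo_University", ["جامعة القاهرة"], ["جامعة مصرية"]),
]
index = build_name_index(entities)
tokens = "تقع جامعة القاهرة في مصر".split()
mentions = find_mentions(tokens, index)
```

In this sketch a name shared by several entities yields several candidate ids, which is where a disambiguation step (for example, one that compares the surrounding text with each candidate's keyphrases) would come in.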

Bibliographic information

Source
Workshop on Arabic Natural Language Processing | 2015 | 10 pages
Authors
Mohamed H. Gad-Elrab; Mohamed Amir Yosef; Gerhard Weikum
Original format: PDF
Chinese Library Classification: Computer software
