Conference: International Conference on Information and Communication Technology

People Entity Recognition in Indonesian Quran Translation with Conditional Random Field Approach



Abstract

The Quran is the primary source of law for Muslims. It comprises 30 juz, divided into 114 surahs and 6,236 verses. Because it covers many topics and mentions many entities, readers sometimes have difficulty understanding it. Identifying the essential entities in the Quran, such as the names of people, can make the text easier to understand. One way to extract this information is Named Entity Recognition (NER), which automatically recognizes essential entities such as people's names. This paper builds a system for identifying person entities in the Quran using a Conditional Random Field (CRF) with a multiple-choice approach, in which the system is given a range of candidate possibilities for each entity and adapts them to the given input in order to detect entities. Evaluated on an Indonesian Quran translation dataset, the system achieves an average F1 score of 0.77 when trained on 36,814 data items drawn from 954 verses of the Quran.
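To make the pipeline in the abstract more concrete, the following minimal sketch shows the kind of per-token features a CRF tagger typically consumes, together with the F1 formula used to report results. It is an illustration only: the abstract does not list the paper's features or state whether F1 is computed per token or per entity, so the feature names, the `PER`/`O` label scheme, the token-level scoring, and the sample sentence ("Dan Musa berkata kepada kaumnya", roughly "And Moses said to his people") are all assumptions rather than details taken from the paper.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dict for tokens[i] (feature names are hypothetical)."""
    word = tokens[i]
    feats: Dict[str, object] = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),  # capitalisation often signals a name
        "suffix3": word[-3:].lower(),
    }
    # Context features: neighbouring words, or sentence-boundary markers.
    if i > 0:
        feats["prev.lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True
    if i < len(tokens) - 1:
        feats["next.lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Feature dicts for every token in one sentence (one CRF sequence)."""
    return [token_features(tokens, i) for i in range(len(tokens))]


def f1_score(gold: List[str], pred: List[str], positive: str = "PER") -> float:
    """Token-level F1 for one label: F1 = 2PR / (P + R)."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative sentence and labels, not drawn from the paper's dataset.
    tokens = ["Dan", "Musa", "berkata", "kepada", "kaumnya"]
    gold = ["O", "PER", "O", "O", "O"]
    pred = ["O", "PER", "O", "O", "PER"]  # one false positive
    print(sentence_features(tokens)[1])
    print(round(f1_score(gold, pred), 3))
```

Feature dictionaries of this shape are the usual input format for CRF toolkits, which learn a weight for each feature and label combination and then decode the most likely label sequence for the whole sentence.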
