Conference: International Conference on Smart Computing & Communications

Named Entity Recognition in Bengali Text Using Merged Hidden Markov Model and Rule Base Approach


Abstract

Named Entity Recognition (NER) is a subtask of Natural Language Processing (NLP) that aims to identify named entities in a specific domain (e.g. newspaper text) at a human level of accuracy. It locates and classifies named entities (person names, locations, organization names, etc.) and is the most vital step of Information Extraction (IE). NER is most often performed with Machine Learning (ML); an alternative is the rule-based approach. This paper presents a method that uses the ML and rule-based approaches together for NER in Bengali, with the rule-based component merged into the ML component. A Hidden Markov Model (HMM) serves as the ML component, and regular expressions implement the rules. A Named Entity (NE) tagged corpus of 10k words was built from Bengali newspaper text and manually annotated with seven tags. The paper concludes with experimental results that show two distinct ways of applying our proposed model.
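
The abstract does not say how the regular-expression rules and the HMM are merged, so the following is only a minimal sketch of one possible combination, not the authors' method. It assumes that a token matching a rule has its tag fixed, while Viterbi decoding over a count-based HMM picks the remaining tags. The rules, the tag names (`PER`, `LOC`, `DATE`, `O`), the transliterated toy corpus, and all function names are hypothetical and chosen purely for illustration.

```python
import math
import re
from collections import defaultdict

# Hypothetical rules (not from the paper): a matching token gets a fixed tag.
RULES = [
    (re.compile(r"^\d{4}$"), "DATE"),  # four-digit year
    (re.compile(r".+pur$"), "LOC"),    # illustrative place-name suffix
]

def rule_tag(token):
    """Return the tag of the first matching rule, or None if no rule fires."""
    for pattern, tag in RULES:
        if pattern.match(token):
            return tag
    return None

def train_hmm(tagged_sentences):
    """Collect transition and emission counts from (word, tag) sentences."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    # Include rule tags so the decoder can score transitions into them.
    tags = sorted(set(emit) | {tag for _, tag in RULES})
    return trans, emit, tags, len(vocab) + 1  # +1 slot for unknown words

def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log-probability from a count table."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))

def viterbi(tokens, model):
    """Viterbi decoding in which rule-matched tokens are forced to the rule's tag."""
    trans, emit, tags, vocab_size = model
    if not tokens:
        return []
    scores, back = [], []
    for i, token in enumerate(tokens):
        forced = rule_tag(token)
        candidates = [forced] if forced else tags
        scores.append({})
        back.append({})
        for tag in candidates:
            # A rule decision replaces the emission score (log 1 = 0).
            e = 0.0 if forced else log_prob(emit[tag], token, vocab_size)
            if i == 0:
                scores[i][tag] = log_prob(trans["<s>"], tag, len(tags)) + e
                back[i][tag] = None
            else:
                prev, s = max(
                    ((p, scores[i - 1][p] + log_prob(trans[p], tag, len(tags)))
                     for p in scores[i - 1]),
                    key=lambda item: item[1],
                )
                scores[i][tag] = s + e
                back[i][tag] = prev
    # Follow back-pointers from the best final tag.
    tag = max(scores[-1], key=scores[-1].get)
    path = [tag]
    for i in range(len(tokens) - 1, 0, -1):
        tag = back[i][tag]
        path.append(tag)
    return path[::-1]

# Toy training data, invented for this sketch.
TRAIN = [
    [("Rahim", "PER"), ("lives", "O"), ("in", "O"), ("Dhaka", "LOC")],
    [("Karim", "PER"), ("visited", "O"), ("Dhaka", "LOC")],
]
model = train_hmm(TRAIN)
predicted = viterbi(["Rahim", "visited", "Rangpur", "in", "1971"], model)
print(predicted)
```

In this sketch "Rangpur" and "1971" never occur in the training data, yet they receive `LOC` and `DATE` because a rule fires; the HMM alone decides the other tokens. With so little training data those HMM-chosen tags are not reliable, which is why the sketch prints the result instead of asserting a full expected sequence.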
