Expert Systems with Applications

A TENGRAM method based part-of-speech tagging of multi-category words in Hindi language


Abstract

In this paper, we address the problem of part-of-speech tagging of multi-category words that appear within sentences of the Hindi language. First, a Hindi tagger is proposed which assigns part-of-speech tags developed from the grammar of Hindi. For this purpose, the tagger uses the Hindi Devanagari alphabet and performs its transliteration internally. Thereafter, a rule-based TENGRAM method is described with an illustrative example; the method guides the disambiguation of multi-category words within the sentences of a Hindi corpus. The rules generated in TENGRAM result from the computation of discernibility matrices, discernibility functions and reducts. These computations are carried out on decision tables, which are based on the theory of rough sets. In essence, a discernibility matrix helps eliminate indiscernible condition attributes; a discernibility function has one row for each column of the discernibility matrix and is used to derive the reducts; and each reduct is a minimal subset of attributes that preserves the indiscernibility relation of the decision table, so the decision rules are generated from the reducts.
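To make the rough-set terminology in the abstract concrete, the following is a minimal sketch of the standard textbook construction (discernibility matrix, reducts, decision rules) on a toy decision table. It is not the authors' TENGRAM implementation: the attribute names (`prev_tag`, `next_tag`, `has_postposition`), the tag values and the four rows are hypothetical and chosen only for illustration, and the reduct search is a brute-force stand-in for simplifying the discernibility function.

```python
from itertools import combinations

# Hypothetical toy decision table: each row is one occurrence of an ambiguous
# word. Condition attributes and values are invented for illustration only.
ATTRS = ("prev_tag", "next_tag", "has_postposition")
ROWS = [
    ({"prev_tag": "ADJ", "next_tag": "VERB", "has_postposition": "no"}, "NOUN"),
    ({"prev_tag": "NOUN", "next_tag": "AUX", "has_postposition": "no"}, "VERB"),
    ({"prev_tag": "ADJ", "next_tag": "AUX", "has_postposition": "yes"}, "NOUN"),
    ({"prev_tag": "NOUN", "next_tag": "VERB", "has_postposition": "no"}, "VERB"),
]


def discernibility_matrix(rows, attrs):
    """For every pair of rows with different decisions, record the set of
    condition attributes on which the two rows differ."""
    matrix = {}
    for i, j in combinations(range(len(rows)), 2):
        (cond_i, dec_i), (cond_j, dec_j) = rows[i], rows[j]
        if dec_i != dec_j:
            matrix[(i, j)] = frozenset(a for a in attrs if cond_i[a] != cond_j[a])
    return matrix


def reducts(matrix, attrs):
    """Brute-force search for minimal attribute subsets that intersect every
    non-empty matrix entry (equivalent to the prime implicants of the
    discernibility function)."""
    entries = [e for e in matrix.values() if e]
    found = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            chosen = set(subset)
            # Skip supersets of an already found reduct: they are not minimal.
            if any(set(r) <= chosen for r in found):
                continue
            if all(chosen & e for e in entries):
                found.append(subset)
    return found


def decision_rules(rows, reduct):
    """Project the table onto a reduct and collect the decisions seen for each
    combination of attribute values."""
    rules = {}
    for cond, dec in rows:
        key = tuple((a, cond[a]) for a in reduct)
        rules.setdefault(key, set()).add(dec)
    return rules


matrix = discernibility_matrix(ROWS, ATTRS)
for reduct in reducts(matrix, ATTRS):
    for condition, decisions in decision_rules(ROWS, reduct).items():
        print(dict(condition), "=>", sorted(decisions))
```

On this toy table, the pair of rows 0 and 3 differs only in `prev_tag`, so every reduct must contain that attribute; it turns out to be sufficient on its own, giving two rules keyed on the previous tag. Real decision tables would be far larger, and exhaustive subset search would need to be replaced by a proper simplification of the discernibility function.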
