Conference: International Conference on Control, Communication Computing India
Text chunker for Malayalam using Memory-Based Learning

Abstract

Text chunking consists of dividing a text into syntactically correlated groups of words. Given the words and their morphosyntactic classes, a chunker decides which words can be grouped into chunks. Malayalam is a free-word-order language with relatively unrestricted phrase structures, which makes chunking quite challenging. This paper develops a text chunker for Malayalam using the Memory-Based Learning (MBL) approach. MBL is a machine learning methodology based on the idea that directly reusing examples through analogical reasoning is better suited to language processing problems than applying rules extracted from those examples. The chunker was trained with the Memory-Based Tagger (MBT) tool, using words and their POS tags as features, and achieved an accuracy of 97.14%.
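The memory-based idea described in the abstract can be illustrated with a minimal sketch. It is not the MBT tool or the authors' setup: it is a plain nearest-neighbour classifier that stores every training token and labels a new token by its most similar stored example. The three-token window, the overlap distance, the IOB-style chunk labels, the class name `MemoryBasedChunker`, and the toy English sentences are all assumptions made for illustration; the abstract only states that words and POS tags were used as features.

```python
from collections import Counter


def window_features(words, tags, i):
    """Feature vector for token i: previous, current and next word and POS tag.

    The window size of three is an assumption, not taken from the paper.
    """
    def at(seq, j):
        return seq[j] if 0 <= j < len(seq) else "_"  # "_" pads sentence edges

    return (at(words, i - 1), at(words, i), at(words, i + 1),
            at(tags, i - 1), at(tags, i), at(tags, i + 1))


def overlap_distance(a, b):
    """Count the feature positions where two instances disagree."""
    return sum(x != y for x, y in zip(a, b))


class MemoryBasedChunker:
    """Hypothetical k-nearest-neighbour chunker that keeps all examples in memory."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []  # list of (feature vector, chunk label) pairs

    def train(self, sentences):
        # "Training" only stores examples; no rules are extracted.
        for words, tags, chunks in sentences:
            for i, label in enumerate(chunks):
                self.memory.append((window_features(words, tags, i), label))

    def classify(self, features):
        # Rank stored examples by distance and take a majority vote over the k nearest.
        nearest = sorted(
            self.memory,
            key=lambda inst: overlap_distance(features, inst[0]),
        )[: self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]

    def chunk(self, words, tags):
        return [self.classify(window_features(words, tags, i))
                for i in range(len(words))]


# Toy English data with made-up tags, purely for illustration.
training = [
    (["the", "dog", "barked"], ["DT", "NN", "VB"], ["B-NP", "I-NP", "B-VP"]),
    (["a", "cat", "slept"], ["DT", "NN", "VB"], ["B-NP", "I-NP", "B-VP"]),
]

chunker = MemoryBasedChunker(k=1)
chunker.train(training)
predicted = chunker.chunk(["the", "bird", "sang"], ["DT", "NN", "VB"])
print(predicted)
```

The unseen words "bird" and "sang" are labelled by analogy with the stored examples that share their POS context, which is the behaviour the abstract contrasts with applying extracted rules.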