Conference: International Conference on Genetic and Evolutionary Computing

Word Boundary Identification for Myanmar Text Using Conditional Random Fields

Abstract

This paper examines the effectiveness of conditional random fields (CRFs) for identifying Myanmar word boundaries within a supervised framework. Existing approaches are based on maximum matching, which appears to suffer from problems related to the way Myanmar words are composed. In our experiments, the CRF approach is compared against a maximum matching baseline that uses dictionaries drawn from the Myanmar Language Commission Dictionary (words only) and from a manually segmented subset of the BTEC1 corpus. The experimental results show that the CRF model achieves considerably higher F-scores on the segmentation task than the baseline, even when the baseline is allowed to use words from the test data in its dictionary.
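The maximum matching baseline and the F-score mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`max_match`, `word_spans`, `f_score`), the toy dictionary, and the use of Latin letters in place of Myanmar characters or syllables are all assumptions made for illustration. The example shows one way a greedy longest-match segmenter can go wrong even when every reference word is in its dictionary.

```python
def max_match(text, dictionary):
    """Greedy forward maximum matching (illustrative sketch).

    At each position, take the longest dictionary entry that matches;
    if nothing matches, emit a single unit and move on.
    """
    max_len = max((len(w) for w in dictionary), default=1)
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one unit.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words


def word_spans(words):
    """Convert a word sequence into a set of (start, end) offsets."""
    spans, start = set(), 0
    for w in words:
        spans.add((start, start + len(w)))
        start += len(w)
    return spans


def f_score(predicted, reference):
    """Word-level F-score: a word counts as correct only if both of its
    boundaries agree with the reference segmentation."""
    p, r = word_spans(predicted), word_spans(reference)
    correct = len(p & r)
    if correct == 0:
        return 0.0
    precision, recall = correct / len(p), correct / len(r)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: Latin letters stand in for Myanmar units.
TOY_DICT = {"the", "them", "mend", "end", "now"}
REFERENCE = ["the", "mend", "now"]

predicted = max_match("themendnow", TOY_DICT)
score = f_score(predicted, REFERENCE)
```

Here the greedy matcher commits to "them" and can no longer recover "mend", so only one of its three words matches the reference. A sequence model such as a CRF labels each unit using its context instead of committing to the longest dictionary hit.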
