Workshop of the Cross-Language Evaluation Forum

Unsupervised Acquiring of Morphological Paradigms from Tokenized Text

Abstract

This paper describes a rather simplistic method for unsupervised morphological analysis of words in an unknown language. All that is needed is a raw text corpus in the given language. The algorithm looks at words, identifies repeatedly occurring stems and suffixes, and constructs probable morphological paradigms. The paper also describes how this method was applied to the Morpho Challenge 2007 task, and reports the Morpho Challenge results. Although quite simple, this approach outperformed, to our surprise, several others in most of the morpheme segmentation subcompetitions. We believe there is enough room for improvement to push the results even higher. Errors are discussed in the paper, together with suggested adjustments for future research.
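
The abstract gives only the outline of the algorithm: look at words, find recurring stems and suffixes, build paradigms. The following is a minimal Python sketch of that general idea, not the paper's actual procedure. The function name `candidate_paradigms`, the thresholds `min_stem_len` and `min_stems`, and the rule of grouping stems by identical suffix sets are all assumptions made here for illustration.

```python
from collections import defaultdict


def candidate_paradigms(words, min_stem_len=3, min_stems=2):
    """Group stems that occur with exactly the same set of suffixes.

    Hypothetical helper: the thresholds and the grouping rule are
    illustrative assumptions, not details taken from the paper.
    """
    vocabulary = set(words)

    # Split every word at every position that leaves a long enough stem.
    # The empty suffix (cut == len(word)) is allowed on purpose.
    suffixes_by_stem = defaultdict(set)
    for word in vocabulary:
        for cut in range(min_stem_len, len(word) + 1):
            suffixes_by_stem[word[:cut]].add(word[cut:])

    # Stems seen with one suffix only carry no paradigm evidence; skip them.
    stems_by_suffix_set = defaultdict(set)
    for stem, suffixes in suffixes_by_stem.items():
        if len(suffixes) > 1:
            stems_by_suffix_set[frozenset(suffixes)].add(stem)

    # Keep suffix sets that recur across several stems.
    return {
        suffix_set: stems
        for suffix_set, stems in stems_by_suffix_set.items()
        if len(stems) >= min_stems
    }


if __name__ == "__main__":
    corpus = "walk walks walked talk talks talked".split()
    for suffix_set, stems in candidate_paradigms(corpus).items():
        print(sorted(suffix_set), sorted(stems))
```

On this toy input the sketch keeps two competing analyses: the stems `walk` and `talk` with the suffixes (empty), `s`, `ed`, and the stems `wal` and `tal` with `k`, `ks`, `ked`. Choosing between such candidates needs some scoring or filtering step, which this sketch deliberately leaves out.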
