Australian Joint Conference on Artificial Intelligence; 4-6 December 2004; Cairns (AU)

Combined Word-Spacing Method for Disambiguating Korean Texts


Abstract

In this paper, we propose an automatic word-spacing method for a Korean text preprocessing system that resolves the problem of context-dependent word spacing. The method combines a stochastic method with partial parsing. First, the stochastic method splits an input sentence into a candidate-word sequence using word unigrams and syllable bigrams. Second, the system engages a partial parsing module based on the asymmetric relation between the candidate words. The partial parsing module manages the governing relationship using words that are registered in the knowledge base as having a high probability of spacing errors. These words serve as parsing trigger points: their linguistic information determines the parsing direction as well as the parsing scope. By combining the stochastic and linguistic methods, the automatic word-spacing system becomes robust against the problem of context-dependent word spacing. On average, the total error rate improves by 8.98% on internal and external data.
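
The first, stochastic step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how word unigrams and syllable bigrams are combined, so the scoring function below (unigram log-probability plus a syllable-pair spacing probability, searched by dynamic programming) is an assumption, and every probability, constant, and function name is hypothetical toy data.

```python
import math

# Hypothetical toy word-unigram probabilities (NOT taken from the paper).
WORD_UNIGRAM = {
    "아버지": 0.02,
    "아버지가": 0.03,
    "가방에": 0.01,
    "방에": 0.02,
    "들어가신다": 0.01,
}

# Hypothetical probability that a space falls between two adjacent syllables.
SPACE_BIGRAM = {
    ("가", "방"): 0.7,
    ("지", "가"): 0.2,
    ("에", "들"): 0.9,
}

DEFAULT_SPACE_PROB = 0.3   # assumed back-off for unseen syllable pairs
UNKNOWN_WORD_PROB = 1e-8   # assumed penalty for out-of-vocabulary words
MAX_WORD_LEN = 6           # assumed upper bound on candidate-word length


def space_prob(left, right):
    """Probability of a space between two syllables (toy model)."""
    return SPACE_BIGRAM.get((left, right), DEFAULT_SPACE_PROB)


def word_score(text, start, end):
    """Log-score of treating text[start:end] as one candidate word."""
    word = text[start:end]
    score = math.log(WORD_UNIGRAM.get(word, UNKNOWN_WORD_PROB))
    # Syllable pairs inside the word should NOT be separated by a space.
    for i in range(start, end - 1):
        score += math.log(1.0 - space_prob(text[i], text[i + 1]))
    # The pair straddling the word's right edge SHOULD be separated.
    if end < len(text):
        score += math.log(space_prob(text[end - 1], text[end]))
    return score


def segment(text):
    """Return the highest-scoring candidate-word sequence for unspaced text."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best log-score of text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            cand = best[start] + word_score(text, start, end)
            if cand > best[end]:
                best[end] = cand
                back[end] = start
    # Follow the back-pointers from the end of the sentence.
    words = []
    pos = n
    while pos > 0:
        words.append(text[back[pos]:pos])
        pos = back[pos]
    return words[::-1]


print(" ".join(segment("아버지가방에들어가신다")))
```

With these toy numbers the sketch prefers the reading 아버지가 방에 들어가신다 over 아버지 가방에 들어가신다. A purely statistical score always picks a single reading regardless of the wider sentence, which is the kind of context-dependent case the abstract hands to the partial parsing module.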
