Published in: ACM Transactions on Asian and Low-Resource Language Information Processing

Korean Part-of-speech Tagging Based on Morpheme Generation

Abstract

Korean part-of-speech (POS) tagging faces two major problems: a word-spacing unit does not map one-to-one to a POS tag, and morphemes must be recovered during tagging. This article proposes a novel two-step Korean POS tagger that addresses both problems. The tagger first uses an encoder-decoder architecture derived from a POS-tagged corpus to generate a sequence of lemmatized and recovered morphemes, each of which can be mapped one-to-one to a POS tag. The POS tag of each morpheme in the generated sequence is then determined by a standard sequence labeling method. Because the knowledge for segmenting and recovering morphemes is extracted automatically from the POS-tagged corpus by the encoder-decoder architecture, the tagger is constructed without a dictionary or handcrafted linguistic rules. Experimental results on a standard dataset show that the proposed method outperforms existing POS taggers and achieves state-of-the-art performance.
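
The data flow of the two steps can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two lookup tables below are hypothetical stand-ins for the learned encoder-decoder (step 1) and the sequence labeler (step 2), the function names are invented for illustration, and the tags assume the Sejong tagset, which the abstract does not name. The sketch only shows why three word-spacing units yield seven morpheme-tag pairs, and how a contracted form such as "갔" is recovered as "가" + "았" before tagging.

```python
# Step 1 stand-in: maps each word-spacing unit (eojeol) to its lemmatized,
# recovered morphemes. ASSUMPTION: a real system would generate these with an
# encoder-decoder model learned from a POS-tagged corpus, not a lookup table.
TOY_MORPHEME_GENERATOR = {
    "나는": ["나", "는"],
    "학교에": ["학교", "에"],
    "갔다": ["가", "았", "다"],  # the contracted "갔" is recovered as "가" + "았"
}

# Step 2 stand-in: assigns exactly one POS tag to each morpheme.
# ASSUMPTION: Sejong-style tags; a real sequence labeler would use context
# rather than a fixed morpheme-to-tag table.
TOY_TAGGER = {
    "나": "NP", "는": "JX",
    "학교": "NNG", "에": "JKB",
    "가": "VV", "았": "EP", "다": "EF",
}


def generate_morphemes(eojeols):
    """Step 1: turn word-spacing units into a flat morpheme sequence."""
    morphemes = []
    for eojeol in eojeols:
        morphemes.extend(TOY_MORPHEME_GENERATOR[eojeol])
    return morphemes


def label_morphemes(morphemes):
    """Step 2: label each morpheme with one POS tag (one-to-one mapping)."""
    return [(morpheme, TOY_TAGGER[morpheme]) for morpheme in morphemes]


def tag_sentence(sentence):
    """Run both steps on a whitespace-separated sentence."""
    return label_morphemes(generate_morphemes(sentence.split()))


# "I went to school": 3 word-spacing units become 7 tagged morphemes.
tagged = tag_sentence("나는 학교에 갔다")
for morpheme, tag in tagged:
    print(f"{morpheme}/{tag}")
```

Separating the steps this way means the labeler in step 2 never sees a unit that needs more than one tag, which is the property the abstract attributes to the generated morpheme sequence.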