Conference: Chinese National Conference on Computational Linguistic

Recognition Method of Important Words in Korean Text based on Reinforcement Learning

Abstract

Manually labeling data to construct a Korean corpus is time-consuming and laborious, and it is difficult to integrate resources for low-resource minority languages. As a result, research on Korean language information processing has progressed slowly. From the perspective of representation learning, this work combines reinforcement learning with traditional deep learning methods and, using Korean text classification performance as the benchmark, studies how to extract the important Korean words in a sentence. A structured model, Information Distilled of Korean (IDK), is proposed. The model examines the words in a Korean sentence, retaining the important words and deleting the unimportant ones, thereby transforming sentence reconstruction into a sequential decision problem. The Policy Gradient method from reinforcement learning can then be introduced to solve this problem. The results show that the model can identify the important words in Korean text, replacing manual annotation for representation learning. Furthermore, compared with traditional text classification methods, the model also improves Korean text classification performance.
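
The keep-or-delete formulation in the abstract can be illustrated with a toy policy-gradient (REINFORCE) loop. This is only a minimal sketch under stated assumptions, not the IDK model itself: the abstract does not give the architecture or the reward. The per-word keep-logit policy, the `LEXICON` stand-in for a trained classifier, the length penalty, the running-mean baseline, and the two example sentences are all hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical stand-in for a trained classifier: per-word evidence for label 1.
LEXICON = {"재미있다": 2.0, "지루하다": -2.0}

# Toy tokenized sentences with sentiment labels (1 = positive, 0 = negative).
DATA = [
    (["이", "영화", "는", "정말", "재미있다"], 1),
    (["이", "영화", "는", "너무", "지루하다"], 0),
]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reward(tokens, actions, label, penalty=0.5):
    """Log-probability of the gold label given the kept words, minus a length cost."""
    score = sum(LEXICON.get(t, 0.0) for t, a in zip(tokens, actions) if a)
    p_pos = sigmoid(score)
    p_gold = p_pos if label == 1 else 1.0 - p_pos
    return math.log(p_gold) - penalty * sum(actions) / len(tokens)


def train(epochs=3000, lr=0.5, seed=0):
    """Learn one keep-logit per word type with REINFORCE; return the logits."""
    rng = random.Random(seed)
    weights = {}     # policy parameters: keep-logit per word type
    baseline = 0.0   # running mean reward, used to reduce gradient variance
    for _ in range(epochs):
        for tokens, label in DATA:
            # Sequential decisions: sample keep (1) or delete (0) for each word.
            probs = [sigmoid(weights.get(t, 0.0)) for t in tokens]
            actions = [1 if rng.random() < p else 0 for p in probs]
            r = reward(tokens, actions, label)
            advantage = r - baseline
            for t, a, p in zip(tokens, actions, probs):
                # For a Bernoulli(sigmoid(w)) policy, d/dw log pi(a) = a - p.
                weights[t] = weights.get(t, 0.0) + lr * advantage * (a - p)
            baseline += 0.05 * (r - baseline)
    return weights


if __name__ == "__main__":
    learned = train()
    for tokens, _ in DATA:
        print([t for t in tokens if learned[t] > 0.0])
```

Words whose learned logit is positive are the ones this toy policy tends to keep. In this setup the reward only improves when a word changes the classifier's confidence, so the length penalty pushes uninformative words toward deletion.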
