Workshop on Chinese Lexical Semantics

Research on Chinese Animal Words Extraction Based on Children's Literature Corpus

Abstract

Categorized and graded vocabularies are an important part of graded reading for children. Taking the animal words in the Thesaurus of Modern Chinese as seed words, this paper studies a method for extracting animal words from a children's literature corpus and attempts to construct a word sequencing model. The extraction method matches the output of automatic word segmentation against the seed words. It yields 786 animal nouns from the corpus, an increase of 39.36% over the 564 seed words, along with 780 derivative animal words. The animal word sequencing model is based on word-work-popularity and word-writer-popularity, which addresses the imbalance in the number of characters and in the number of works per writer.
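The matching step can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract says only that segmentation results are matched with the seed words, so the suffix rule used here to flag candidate new words is an assumption, as are the function name `match_animal_words` and the toy tokens and seeds. The last lines reproduce the reported growth figure from the two counts given in the abstract.

```python
def match_animal_words(tokens, seed_words):
    """Split segmented tokens into exact seed matches and candidate new words.

    Assumption (not from the paper): a token that is longer than a seed word
    and ends with it is treated as a candidate animal word.
    """
    seeds = set(seed_words)
    exact, candidates = set(), set()
    for tok in tokens:
        if tok in seeds:
            exact.add(tok)
        elif any(tok.endswith(s) for s in seeds if len(s) < len(tok)):
            candidates.add(tok)
    return exact, candidates


# Hypothetical segmenter output and seed list, for illustration only.
tokens = ["小", "白兔", "看见", "一只", "大灰狼", "和", "老虎"]
seeds = ["兔", "狼", "老虎"]
exact, candidates = match_animal_words(tokens, seeds)

# Relative growth from 564 seed words to 786 extracted animal nouns.
growth = (786 - 564) / 564 * 100
print(f"{growth:.2f}%")
```

In practice, candidates found by a rule like this would still need manual review, since a shared final character does not guarantee that a word names an animal.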
