Conference: Chinese Lexical Semantics Workshop

Corpus-Based Statistics of Pre-Qin Chinese


Abstract

Pre-Qin Chinese plays a key role in the history of the Chinese language. However, for lack of an annotated corpus, the overall picture of Pre-Qin Chinese vocabulary remains unclear. This paper introduces a corpus of 25 Pre-Qin classical texts with manual word segmentation and part-of-speech tagging. Character and word frequencies are then calculated from the corpus. Character entropy, word length in syllables, and words with multiple parts of speech are also analyzed statistically.
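
The abstract does not say how the character frequencies and character entropy are computed. A minimal sketch follows, assuming the standard Shannon entropy over relative character frequencies. The function names, the punctuation set, and the sample line (a sentence from the Analects used only as illustrative input, not the paper's corpus data) are all assumptions made for this example.

```python
import math
from collections import Counter

# Assumed set of characters to ignore; a real corpus would need its own list.
SKIP = set(" \n\t,。、;:?!,.;:?!")

def char_frequencies(text):
    """Count every character in text except whitespace and punctuation in SKIP."""
    return Counter(ch for ch in text if ch not in SKIP)

def char_entropy(freqs):
    """Shannon entropy in bits per character: H = -sum(p * log2(p))."""
    total = sum(freqs.values())
    return -sum((n / total) * math.log2(n / total) for n in freqs.values())

# Illustrative input only (Analects 2.17), not data from the paper.
sample = "知之為知之,不知為不知,是知也"
freqs = char_frequencies(sample)
entropy = char_entropy(freqs)

for ch, n in freqs.most_common():
    print(ch, n)
print(f"entropy: {entropy:.3f} bits/char")
```

Word frequencies could be computed the same way once the text is split into segmented tokens, by counting tokens instead of characters.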
