Extension of Zipf's Law to Words and Phrases



Abstract

Zipf's law states that the frequency of word tokens in a large corpus of natural language is inversely proportional to their rank. The law is investigated for two languages, English and Mandarin, and for n-gram word phrases as well as for single words. The law for single words is shown to be valid only for high-frequency words. However, when single words and n-gram phrases are combined in one list and ordered by frequency, the combined list follows Zipf's law accurately for all words and phrases, down to the lowest frequencies, in both languages. The Zipf curves for the two languages are then almost identical.
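The combined-list procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the whitespace-style token list, and the cutoff `max_n=3` are all assumptions, since the abstract does not state the tokenization or the largest n-gram order used. It counts single words and n-gram phrases in one table, orders them by frequency, and estimates the slope of log(frequency) against log(rank), which would be close to -1 if the list follows Zipf's law.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return zip(*(tokens[i:] for i in range(n)))


def combined_rank_frequency(tokens, max_n=3):
    """Count all 1..max_n-grams in ONE table; return frequencies in rank order.

    max_n is a hypothetical cutoff chosen for illustration only.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    # most_common() sorts by descending frequency, so position = rank - 1.
    return [freq for _, freq in counts.most_common()]


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) versus log(rank).

    Needs at least two frequencies; a slope near -1 indicates Zipf-like decay.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Toy input; a real test would need a large corpus, as the abstract notes.
    tokens = "the cat sat on the mat and the cat ran".split()
    freqs = combined_rank_frequency(tokens, max_n=2)
    print(freqs[:5], loglog_slope(freqs))
```

On a toy input like the one above the slope is not meaningful; the sketch only shows how words and phrases end up ranked in a single list. For Mandarin, the token list would first have to come from a word segmenter, which is outside this sketch.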
