Conference: International Conference on Communications

Aspects Revealing the Orthography and Punctuation Impact in Printed Romanian: A Literary Corpus Based Study


Abstract

This paper is part of a broader study on the impact of orthography and punctuation in the printed Romanian language model. The experimental study uses a corpus of 49 books (novels and short stories) totaling over 6 million words. The paper identifies three areas of priority interest for the natural language (NL) user along the Zipf's law graph, used to analyze the significant words of the language (which cover over 70% of the occurrences in the corpus). The analysis is then extended to the author subcorpus. The final words of the sentence or complex sentence are also considered.
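The kind of analysis the abstract describes can be illustrated with a minimal sketch. It is not the authors' method: the tokenization rule, the sentence-splitting rule, the function names, and the toy sample text are all assumptions made for illustration. It ranks words by frequency (the ordering behind a Zipf's law graph), finds how many top-ranked words are needed to reach a given share of all occurrences, and counts sentence-final words.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of Unicode letters. The paper's own
# tokenization and its treatment of orthography are not given in the abstract.
WORD_RE = re.compile(r"[^\W\d_]+")


def rank_frequency(text):
    """Return (word, count) pairs sorted by descending count (Zipf rank order)."""
    words = WORD_RE.findall(text.lower())
    return Counter(words).most_common()


def words_needed_for_coverage(ranked, threshold=0.70):
    """Smallest number of top-ranked words whose counts reach `threshold` of all tokens."""
    total = sum(count for _, count in ranked)
    running = 0
    for rank, (_, count) in enumerate(ranked, start=1):
        running += count
        if running >= threshold * total:
            return rank
    return len(ranked)


def sentence_final_words(text):
    """Count the last word of each sentence (hypothetical rule: split on . ! ?)."""
    finals = Counter()
    for sentence in re.split(r"[.!?]+", text):
        words = WORD_RE.findall(sentence.lower())
        if words:
            finals[words[-1]] += 1
    return finals


# Toy sample text, invented for illustration only (not from the corpus).
sample = "Ana are mere. Ion are pere și mere. Ana și Ion au mere!"
ranked = rank_frequency(sample)
print(ranked[:3])
print(words_needed_for_coverage(ranked, 0.70))
print(sentence_final_words(sample).most_common(1))
```

On a real corpus, the same functions could be applied per author to compare subcorpora, with the 0.70 threshold mirroring the "over 70% of the occurrences" figure quoted in the abstract.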
