Workshop on Predicting and Improving Text Readability for Target Reader Populations 2013

Modeling Comma Placement in Chinese Text for Better Readability using Linguistic Features and Gaze Information

Abstract

Comma placement in Chinese text is relatively arbitrary, although some syntactic guidelines exist. In this research, we attempt to improve the readability of text by optimizing comma placement, integrating linguistic features of the text with gaze features of readers. We design a comma predictor for general Chinese text based on conditional random field (CRF) models with linguistic features. We then build a rule-based filter that categorizes commas according to their contribution to readability, based on an analysis of the gaze of people reading text with and without commas. The experimental results show that our predictor reproduces the comma distribution in the Penn Chinese Treebank with an F_1-score of 78.41, and that commas chosen by our filter smooth certain gaze behaviors.
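
The abstract does not describe how comma prediction is framed or scored, so the following is only a minimal sketch under stated assumptions: each non-comma token gets a binary label saying whether a comma follows it, and the F_1-score is the harmonic mean of precision and recall over those labels. The function names comma_labels and f1_score are hypothetical and are not taken from the paper; the paper's actual label scheme and evaluation protocol may differ.

```python
from typing import List, Tuple

# Assumption: both the ASCII and the full-width comma count as commas.
COMMAS = {",", "，"}


def comma_labels(tokens: List[str]) -> Tuple[List[str], List[int]]:
    """Strip commas from a token list and label each remaining token
    1 if a comma followed it in the original text, else 0."""
    words: List[str] = []
    labels: List[int] = []
    for tok in tokens:
        if tok in COMMAS:
            if labels:  # a comma before any word has nothing to attach to
                labels[-1] = 1
        else:
            words.append(tok)
            labels.append(0)
    return words, labels


def f1_score(gold: List[int], pred: List[int]) -> float:
    """F_1 over the positive (comma) class: 2PR / (P + R)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In this framing, a sequence model such as a CRF would predict the label list from features of the comma-free word list, and the predicted labels would be compared with the gold labels derived from the original text.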