Conference: IEEE International Professional Communication Conference

Reading Time Prediction Model on Chinese Technical Documentation

Abstract

This paper was presented at the Invited Panel session “Technical Communication in China”. As readers increasingly turn to online materials, a variety of research has examined the reading time and legibility of online texts. Text-related attributes such as font size and letter spacing are commonly used variables in this field. The objective of this study is to investigate the factors that influence the reading time of Chinese technical documentation and to build a Decision Tree model to predict that reading time. In the experiment, log data covering over a million user visits to a cloud service provider’s website were collected. Each user session records the visit time, stay time, visit step, visit device and many other data fields. In addition to user behavioral data from the log files, metrics describing the technical documentation itself were collected: for every document used in the experiment, the word count, image count, link count and section count were scraped using web crawlers. Linear correlation analysis is applied to explore the correlations between the variables used for prediction. The results show that the Decision Tree model achieves 75 percent accuracy.
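The two analysis steps named in the abstract, linear correlation analysis followed by a Decision Tree, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the document metrics and stay times below are made-up toy values, the 120-second cutoff that buckets stay time into "short" and "long" is hypothetical, and all function names are invented for this sketch. The abstract does not say how reading time was discretized or which tree algorithm was used; Gini-based splits are shown here as one common choice.

```python
from collections import Counter
from math import sqrt

def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Build a tree of (feature, threshold, left, right) tuples; leaves are class labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # majority class
    f, t = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= t]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [l for _, l in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk the tree from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical toy data: (word_count, image_count, link_count, section_count)
docs = [
    (300, 1, 2, 2), (450, 0, 3, 3), (500, 2, 1, 3), (1200, 3, 5, 6),
    (1500, 4, 8, 7), (2000, 2, 10, 9), (800, 1, 4, 4), (2500, 6, 12, 10),
]
stay_seconds = [40, 55, 60, 150, 170, 240, 90, 300]  # made-up stay times

# Step 1: linear correlation between each document metric and stay time.
names = ["word_count", "image_count", "link_count", "section_count"]
correlations = {
    name: pearson([d[i] for d in docs], stay_seconds)
    for i, name in enumerate(names)
}

# Step 2: bucket stay time (assumed 120 s cutoff) and fit a small tree.
labels = ["long" if s >= 120 else "short" for s in stay_seconds]
tree = build_tree(docs, labels)

print(correlations)
print(predict(tree, (600, 1, 2, 3)), predict(tree, (1800, 3, 6, 8)))
```

On real log data the tree would be evaluated on held-out sessions rather than on the documents it was trained on; the toy set here is too small for a meaningful accuracy figure.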
