Journal: Natural Language Engineering

Mining, analyzing, and modeling text written on mobile devices

Abstract

We present a method for mining the web for text entered on mobile devices. Using searching, crawling, and parsing techniques, we locate text that can be reliably identified as originating from 300 mobile devices. This includes 341,000 sentences written on iPhones alone. Our data enables a richer understanding of how users type "in the wild" on their mobile devices. We compare text and error characteristics of different device types, such as touchscreen phones, phones with physical keyboards, and tablet computers. Using our mined data, we train language models and evaluate these models on mobile test data. A mixture model trained on our mined data, Twitter, blog, and forum data predicts mobile text better than baseline models. Using phone and smartwatch typing data from 135 users, we demonstrate our models improve the recognition accuracy and word predictions of a state-of-the-art touchscreen virtual keyboard decoder. Finally, we make our language models and mined dataset available to other researchers.
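
The abstract does not say how the mixture model combines its component models. As a rough illustration only, the sketch below assumes linear interpolation of add-one smoothed unigram models and compares held-out perplexity against a single-source baseline. The toy sentences, the weights (0.7 and 0.3), and the helper names `train_unigram`, `mixture_prob`, and `perplexity` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def train_unigram(sentences, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in s.split() if w in vocab)
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def mixture_prob(word, models, weights):
    """Linearly interpolate the component probabilities for one word."""
    return sum(lam * m[word] for lam, m in zip(weights, models))


def perplexity(sentences, models, weights):
    """Per-word perplexity of the mixture on held-out text."""
    words = [w for s in sentences for w in s.split()]
    log_prob = sum(math.log(mixture_prob(w, models, weights)) for w in words)
    return math.exp(-log_prob / len(words))


# Toy stand-ins for the corpora (assumed data, not the mined dataset).
mobile = ["on my way", "see you soon", "on my phone"]
twitter = ["see you at the game", "the game is on"]
held_out = ["see you on my way"]

# Shared vocabulary so every component assigns mass to every test word.
vocab = {w for s in mobile + twitter + held_out for w in s.split()}
m_mobile = train_unigram(mobile, vocab)
m_twitter = train_unigram(twitter, vocab)

# Mixture of both sources versus a Twitter-only baseline.
ppl_mix = perplexity(held_out, [m_mobile, m_twitter], [0.7, 0.3])
ppl_twitter = perplexity(held_out, [m_twitter], [1.0])
print(f"mixture: {ppl_mix:.2f}  twitter-only: {ppl_twitter:.2f}")
```

In practice the interpolation weights would be tuned on held-out mobile text rather than fixed by hand, and the components would be higher-order n-gram models rather than unigrams.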
