Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Personalizing Recurrent-Neural-Network-Based Language Model by Social Network


Abstract

With the popularity of mobile devices, personalized speech recognizers have become more attainable and are highly attractive. Since each mobile device is used primarily by a single user, it is possible to have a personalized recognizer that well matches the characteristics of the individual user. Although acoustic model personalization has been investigated for decades, much less work has been reported on personalizing language models, presumably because of the difficulties in collecting sufficient personalized corpora. In this paper, we propose a general framework for personalizing recurrent-neural-network-based language models (RNNLMs) using data collected from social networks, including the posts of many individual users and friend relationships among the users. Two major directions for this are model-based and feature-based RNNLM personalization. In model-based RNNLM personalization, the RNNLM parameters are fine-tuned to an individual user's wording patterns by incorporating social texts posted by the target user and his or her friends. For the feature-based approach, the RNNLM model parameters are fixed across users, but the RNNLM input features are instead augmented with personalized information. Both approaches not only drastically reduce the model perplexity, but also moderately reduce word error rates in n-best rescoring tests.
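The feature-based approach described above keeps one set of RNNLM parameters shared across users and instead concatenates a per-user feature vector to each input word embedding. A minimal numpy sketch of that idea follows; all dimensions, weight scales, and the random user vectors are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions, not the paper's settings).
VOCAB, EMBED, HIDDEN, USER_DIM = 50, 16, 32, 8

# Parameters shared across all users (fixed in the feature-based approach).
W_emb = rng.normal(0, 0.1, (VOCAB, EMBED))
W_xh = rng.normal(0, 0.1, (EMBED + USER_DIM, HIDDEN))  # word + user features -> hidden
W_hh = rng.normal(0, 0.1, (HIDDEN, HIDDEN))
W_hy = rng.normal(0, 0.1, (HIDDEN, VOCAB))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def rnnlm_step(word_id, h_prev, user_feat):
    """One Elman-RNN step whose input is the word embedding
    concatenated with a personalized user feature vector."""
    x = np.concatenate([W_emb[word_id], user_feat])
    h = np.tanh(x @ W_xh + h_prev @ W_hh)
    p = softmax(h @ W_hy)  # distribution over the next word
    return h, p

def sentence_logprob(word_ids, user_feat):
    """Log-probability of a word sequence under the personalized RNNLM,
    e.g. for rescoring an n-best hypothesis list."""
    h = np.zeros(HIDDEN)
    logp = 0.0
    for prev, nxt in zip(word_ids[:-1], word_ids[1:]):
        h, p = rnnlm_step(prev, h, user_feat)
        logp += np.log(p[nxt])
    return logp

# Hypothetical user feature vectors; in the paper these would be
# derived from each user's social-network data.
user_a = rng.normal(0, 1, USER_DIM)
user_b = rng.normal(0, 1, USER_DIM)
sent = [3, 17, 42, 5]
lp_a = sentence_logprob(sent, user_a)
lp_b = sentence_logprob(sent, user_b)
```

The same hypothesis receives different scores for different users (`lp_a` vs. `lp_b`) even though every weight matrix is shared, which is the point of the feature-based design: personalization costs one extra input vector per user rather than a fine-tuned copy of the model.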
