Conference: Annual Meeting of the Association for Computational Linguistics

Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection

Abstract

We present a joint model for Chinese word segmentation and new word detection. For the joint modeling, we introduce new high-dimensional features, including word-based features and enriched edge (label-transition) features. Training a word segmentation system on large-scale datasets is already costly, and adding high-dimensional features slows training further. To solve this problem, we propose a new training method, adaptive online gradient descent based on feature frequency information, which trains the parameters very quickly even on large-scale datasets with high-dimensional features. Compared with existing training methods, our method is an order of magnitude faster in training time and achieves equal or even higher accuracy. The proposed fast training method is a general-purpose optimization method and is not limited to the specific task discussed in this paper.
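The abstract does not spell out the update rule, so the following is only a minimal sketch of what a frequency-adaptive online gradient method could look like, not the authors' algorithm. It assumes a toy binary logistic regression (not the segmentation model), one learning rate per feature, and a decay applied at the end of each window of samples that shrinks the rates of frequent features faster than those of rare ones. The names `train_frequency_adaptive`, `window`, `base_rate`, `alpha`, and `beta`, and the decay formula itself, are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_frequency_adaptive(data, epochs=5, window=4, base_rate=0.5,
                             alpha=0.995, beta=0.6):
    """Online logistic regression with one learning rate per feature.

    data: list of (features, label) pairs, where features is a dict
    {feature_name: value} and label is 0 or 1.
    Assumption (hypothetical rule): after every `window` samples, each
    feature seen in that window has its rate multiplied by a factor
    between `beta` (appeared in every sample) and `alpha` (appeared rarely).
    """
    weights = defaultdict(float)
    rates = defaultdict(lambda: base_rate)  # per-feature learning rates
    counts = defaultdict(int)               # occurrences in current window
    seen = 0
    for _ in range(epochs):
        for features, label in data:
            score = sum(weights[f] * v for f, v in features.items())
            error = label - sigmoid(score)
            # Stochastic gradient step, scaled by each feature's own rate.
            for f, v in features.items():
                weights[f] += rates[f] * error * v
                counts[f] += 1
            seen += 1
            if seen % window == 0:
                for f, c in counts.items():
                    u = c / window  # relative frequency in [0, 1]
                    rates[f] *= alpha - u * (alpha - beta)
                counts.clear()
    return dict(weights), dict(rates)


toy_data = [
    ({"bias": 1.0, "a": 1.0}, 1),
    ({"bias": 1.0, "b": 1.0}, 0),
    ({"bias": 1.0, "a": 1.0}, 1),
    ({"bias": 1.0, "b": 1.0, "rare": 1.0}, 0),
]
weights, rates = train_frequency_adaptive(toy_data)
```

In this sketch, the always-present `bias` feature ends with the smallest learning rate and the feature `rare` with the largest. This reflects the intuition that frequently updated weights settle early, while rarely seen features still need large steps.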
