Annual Meeting of the Association for Computational Linguistics (ACL 2012)

Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection

Abstract

We present a joint model for Chinese word segmentation and new word detection. For the joint modeling we introduce new high-dimensional features, including word-based features and enriched edge (label-transition) features. Training a word segmentation system on large-scale datasets is already costly, and adding high-dimensional features slows training further. To solve this problem, we propose a new training method, adaptive online gradient descent based on feature frequency information, which trains the parameters very quickly even on large-scale datasets with high-dimensional features. Compared with existing training methods, our method is an order of magnitude faster in training time and achieves equal or even higher accuracy. The proposed fast training method is a general-purpose optimization method and is not limited to the specific task discussed in this paper.
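The abstract names the training method but does not give its update rule. The sketch below is only an illustrative assumption of what a frequency-adaptive learning rate could look like: a toy logistic model over sparse binary features in which each feature keeps its own learning rate, and features seen more often within a window of samples have their rate decayed faster. The function name `train_frequency_adaptive`, the parameters `window`, `slow_decay`, and `fast_decay`, and the linear interpolation between the two decay factors are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def train_frequency_adaptive(samples, n_epochs=5, base_rate=0.5, window=2,
                             slow_decay=0.999, fast_decay=0.9):
    """Toy online logistic regression with one learning rate per feature.

    samples: list of (set_of_active_feature_ids, label), label in {0, 1}.
    All hyperparameter names and defaults here are illustrative assumptions.
    """
    weights = defaultdict(float)
    rates = defaultdict(lambda: base_rate)  # per-feature learning rates
    counts = defaultdict(int)               # feature hits in current window
    seen = 0
    for _ in range(n_epochs):
        for features, label in samples:
            score = sum(weights[f] for f in features)
            prob = 1.0 / (1.0 + math.exp(-score))
            error = label - prob            # gradient of log-likelihood
            for f in features:
                weights[f] += rates[f] * error
                counts[f] += 1
            seen += 1
            if seen % window == 0:
                # Frequent features get a decay closer to fast_decay,
                # rare features a decay closer to slow_decay.
                for f, c in counts.items():
                    freq = min(1.0, c / window)
                    decay = slow_decay - freq * (slow_decay - fast_decay)
                    rates[f] *= decay
                counts.clear()
    return dict(weights), dict(rates)


if __name__ == "__main__":
    data = [({"bias", "a"}, 1), ({"bias", "b"}, 0)]
    w, r = train_frequency_adaptive(data)
    print(w)
    print(r)
```

In this toy run the `bias` feature fires on every sample, so its rate shrinks faster than the rates of `a` and `b`. That matches the intuition suggested by the method's name: frequently updated parameters are already well estimated and can take smaller steps, while rare ones keep larger steps.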