Journal: ACM Transactions on Asian Language Information Processing

A Fast and Compact Language Model Implementation Using Double-Array Structures


Abstract

Language models are widely used in fields such as natural language processing, automatic speech recognition, and optical character recognition. Statistical machine translation in particular relies on language models, and both translation speed and memory requirements depend heavily on the performance of the language model implementation. We propose a fast and compact implementation of n-gram language models that increases query speed and reduces memory usage by using a double-array structure, which is known to be a fast and compact trie data structure. We propose two types of implementation: one for backward suffix trees and the other for reverse tries. The data structure is optimized for space efficiency by embedding model parameters into otherwise unused spaces in the double-array structure. We show that the reverse trie version of our method is among the smallest state-of-the-art implementations in terms of model size, with almost the same speed as the implementation that performs fastest on perplexity calculation tasks. Similarly, on translation tasks we achieve faster decoding while keeping model sizes compact, confirming that our method can exploit the efficiency of the double-array structure to balance speed and size.
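To make the double-array idea concrete, the following is a minimal illustrative sketch and not the authors' implementation. It stores n-grams in reverse word order, as a reverse trie does, using two parallel arrays: a transition from node `s` on word ID `w` goes to `t = base[s] + w` and is valid only if `check[t] == s`. The names `build_double_array`, `lookup`, and `FREE`, the use of positive integer word IDs, and the separate `value` array are all assumptions made for illustration. In particular, the abstract describes embedding model parameters into unused slots of the double array, which this sketch does not attempt.

```python
from collections import deque

FREE = -1  # hypothetical marker for an unused CHECK slot


def build_double_array(ngrams):
    """Build BASE/CHECK/VALUE lists from {tuple_of_word_ids: log_prob}.

    Word IDs are assumed to be positive integers. Keys are inserted in
    reverse word order, so the last word of an n-gram is matched first.
    """
    # Step 1: build an ordinary nested-dict trie over the reversed keys.
    root = {"children": {}, "value": None}
    for words, logp in ngrams.items():
        node = root
        for w in reversed(words):
            node = node["children"].setdefault(
                w, {"children": {}, "value": None})
        node["value"] = logp

    # Slot 0 is the root; it is marked as used by pointing at itself.
    base, check, value = [0], [0], [None]

    def ensure(size):
        # Grow all three arrays so that indices below `size` exist.
        while len(base) < size:
            base.append(0)
            check.append(FREE)
            value.append(None)

    # Step 2: place each node's children breadth-first. For every node,
    # pick the smallest offset b >= 1 whose target slots are all free.
    queue = deque([(root, 0)])
    while queue:
        node, s = queue.popleft()
        symbols = sorted(node["children"])
        if not symbols:
            continue  # leaf: base[s] stays 0
        b = 1
        while True:
            ensure(b + symbols[-1] + 1)
            if all(check[b + c] == FREE for c in symbols):
                break
            b += 1
        base[s] = b
        for c in symbols:
            t = b + c
            check[t] = s  # record the parent so lookups can verify it
            child = node["children"][c]
            value[t] = child["value"]
            queue.append((child, t))
    return base, check, value


def lookup(base, check, value, words):
    """Return the stored value for an n-gram, or None if it is absent."""
    s = 0
    for w in reversed(words):
        t = base[s] + w
        if t >= len(check) or check[t] != s:
            return None  # no such transition
        s = t
    return value[s]
```

Each step of a query costs one addition and one comparison, with no hashing or binary search, which is the property that makes double arrays attractive for language model lookups. The first-fit search for `b` used here is the simplest placement strategy and is slow to build on large vocabularies; it is chosen only to keep the sketch short.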