Conference: 9th International Conference on Language Resources and Evaluation

Heuristic Hyper-minimization of Finite State Lexicons

Abstract

Flag diacritics, special multi-character symbols that are evaluated at runtime, make it possible to optimise finite-state networks by merging identical sub-graphs of their transition graphs. Traditionally, linguists have had to devise these graph optimisations by hand alongside the morphological description. In this paper, we present a novel method for automatically discovering flag positions in morphological lexicons, based on the morpheme structure implicit in the language description. With this approach, we achieve a significant decrease in the size of finite-state networks while maintaining reasonable application speed. The algorithm can be applied to any language description, and the largest gains are expected for large and complex morphologies. The most noticeable size reduction was obtained with a morphological transducer for Greenlandic, whose original size is on average about 15 times larger than that of other morphologies. With the presented hyper-minimization method, the transducer is reduced to 10.1% of its original size, while lookup speed decreases by only 9.5%.
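
To illustrate how flag diacritics let one shared sub-graph serve several lexicon classes, here is a minimal Python sketch. It is not the paper's algorithm: it only simulates the runtime check along a single path. It assumes the Xerox-style notation (`@P.FEATURE.VALUE@` to set, `@R.FEATURE.VALUE@` to require, `@D.FEATURE@` to disallow, `@C.FEATURE@` to clear) and covers only these four operations. The function name `run_flags` and the toy English lexicon are hypothetical.

```python
import re

# Assumed flag syntax: @OP.FEATURE@ or @OP.FEATURE.VALUE@ with OP in P, R, D, C.
FLAG = re.compile(r"^@([PRDC])\.([^.@]+)(?:\.([^.@]+))?@$")


def run_flags(path):
    """Return the surface string for a path, or None if a flag check fails."""
    features = {}   # current feature -> value assignments
    output = []     # ordinary (non-flag) symbols seen so far
    for symbol in path:
        match = FLAG.match(symbol)
        if match is None:
            output.append(symbol)          # ordinary symbol: emit it
            continue
        op, feat, val = match.groups()
        if op == "P":                      # positive set
            if val is None:
                raise ValueError("P flag needs a value: " + symbol)
            features[feat] = val
        elif op == "C":                    # clear the feature
            features.pop(feat, None)
        elif op == "R":                    # require the feature (or a value)
            if val is None:
                if feat not in features:
                    return None
            elif features.get(feat) != val:
                return None
        elif op == "D":                    # disallow the feature (or a value)
            if val is None:
                if feat in features:
                    return None
            elif features.get(feat) == val:
                return None
    return "".join(output)


# Toy lexicon (hypothetical): stems record their class with a P flag, and the
# suffixes live in one shared sub-graph guarded by R flags.
noun = ["@P.CLASS.N@", "c", "a", "t"]
verb = ["@P.CLASS.V@", "w", "a", "l", "k"]
plural = ["@R.CLASS.N@", "s"]
past = ["@R.CLASS.V@", "e", "d"]

print(run_flags(noun + plural))
print(run_flags(verb + past))
print(run_flags(noun + past))
```

In this sketch the noun stem followed by the plural suffix and the verb stem followed by the past suffix pass their checks, while the noun stem followed by the past suffix is rejected. The trade-off the abstract mentions is visible here: the network stores the suffixes once, but every lookup pays for the extra flag bookkeeping.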