Journal: Pattern Analysis and Applications

Applying a sectioned genetic algorithm to word segmentation

Abstract

This article presents a novel approach to morphological analysis based on genetic algorithms (GAs). Morphological analysis is of critical importance in data mining and information retrieval systems because it leads to a more homogeneous representation of words. The system presented here makes minimal use of language-specific information and is therefore more general than the rule-based techniques that have been proposed in the literature. A number of heuristics, both general-purpose ones and heuristics designed specifically for the task, are created and tested as evaluation functions, and the most suitable models for the genetic operators are selected for this implementation. Finally, the system addresses the problem of processing a great number of words simultaneously without excessively increasing the execution time or degrading the segmentation quality of the final results. This is accomplished by applying a group of masks to divide each individual into sections and running the GA on these smaller sections rather than on the entire individual.
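
The sectioning idea can be illustrated with a minimal Python sketch. The abstract does not specify the encoding, the masks, the heuristics, or the genetic operators, so everything below is an assumption made for illustration: each gene marks a possible boundary between two adjacent characters, fixed-size groups of words stand in for the paper's masks, the evaluation function is a toy lexicon-cost heuristic, and the names `split_word`, `section_fitness`, `evolve_section`, and `sectioned_ga` are hypothetical.

```python
import random

def split_word(word, cuts):
    """Split `word` after character i wherever cuts[i] == 1."""
    parts, start = [], 0
    for i, c in enumerate(cuts):
        if c:
            parts.append(word[start:i + 1])
            start = i + 1
    parts.append(word[start:])
    return parts

def section_fitness(words, genes):
    """Toy heuristic (an assumption, not from the paper): a smaller morph
    lexicon is better. The cost is negated so that higher means fitter."""
    morphs, pos = set(), 0
    for w in words:
        n = len(w) - 1                      # one gene per gap between characters
        morphs.update(split_word(w, genes[pos:pos + n]))
        pos += n
    return -(sum(len(m) for m in morphs) + len(morphs))

def evolve_section(words, n_genes, rng, pop_size=20, generations=40, p_mut=0.05):
    """Run a small elitist GA on the genes of one section only."""
    pop = [[0] * n_genes] + [[rng.randint(0, 1) for _ in range(n_genes)]
                             for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda g: section_fitness(words, g), reverse=True)
        parents = pop[:pop_size // 2]       # truncation selection; the best survive unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_genes - 1) if n_genes > 1 else 0
            child = a[:cut] + b[cut:]       # one-point crossover
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda g: section_fitness(words, g))

def sectioned_ga(words, section_size=2, seed=0):
    """Segment non-empty `words` by evolving each section independently.
    Fixed-size slices are used here in place of the paper's masks."""
    rng = random.Random(seed)
    result = []
    for i in range(0, len(words), section_size):
        section = words[i:i + section_size]
        n_genes = sum(len(w) - 1 for w in section)
        best = evolve_section(section, n_genes, rng)
        pos = 0
        for w in section:                   # decode the best genes back into morphs
            n = len(w) - 1
            result.append(split_word(w, best[pos:pos + n]))
            pos += n
    return result

if __name__ == "__main__":
    for parts in sectioned_ga(["walking", "talking", "walked", "talked"]):
        print("+".join(parts))
```

Because each section carries only a few genes, the search space per GA run stays small as the word list grows, which is the effect the abstract attributes to sectioning. The toy heuristic is not expected to produce linguistically correct morphemes; it only shows where an evaluation function plugs in.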
