Journal: Computer Standards & Interfaces

Arabic string searching in the context of character code standards and orthographic variations

Abstract

This paper considers, from a practical perspective, the problem of searching Arabic text for a given pattern. Orthographic and character encoding variations are discussed. Four well-known string searching algorithms were modified to handle Arabic text, and their performance was examined and analyzed using vocalized and unvocalized Arabic texts of varying length. All four algorithms could be modified to handle diacritics, but only three could support spelling variant checking. The empirical results show that the Boyer-Moore-Horspool (BMH) algorithm provides the best overall running time on Arabic text, in terms of both the average number of comparisons and the average actual machine execution time. Most of the increase in the cost of BMH is attributed to spelling variant checking, whereas diacritics are handled comparatively efficiently.
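As a rough illustration of the kind of matching the abstract describes, the following minimal sketch runs a standard Boyer-Moore-Horspool search over Arabic text after dropping diacritics and folding a few spelling variants. It is not the authors' modified algorithm, which the abstract does not specify: the names `DIACRITICS`, `VARIANTS`, `fold`, and `bmh_search` are hypothetical, the choice of which letters count as variants is an assumption made for the example, and folding is done as a preprocessing pass rather than inside the comparison loop.

```python
# Assumption: the Arabic marks U+064B..U+0652 (tanwin, short vowels, shadda,
# sukun) plus superscript alef U+0670 are treated as ignorable diacritics.
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)} | {"\u0670"}

# Assumption: an illustrative, non-exhaustive set of spelling variants.
VARIANTS = {
    "\u0622": "\u0627",  # alef with madda -> bare alef
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0649": "\u064A",  # alef maksura -> yeh
    "\u0629": "\u0647",  # teh marbuta -> heh
}


def fold(s):
    """Drop diacritics and fold variants; also return original indices."""
    chars, positions = [], []
    for i, ch in enumerate(s):
        if ch in DIACRITICS:
            continue
        chars.append(VARIANTS.get(ch, ch))
        positions.append(i)
    return chars, positions


def bmh_search(text, pattern):
    """Return start indices (in the original text) of all folded matches."""
    t, t_pos = fold(text)
    p, _ = fold(pattern)
    m, n = len(p), len(t)
    if m == 0 or m > n:
        return []
    # Horspool bad-character table: distance from the last occurrence of each
    # character (excluding the final pattern position) to the pattern end.
    shift = {ch: m - 1 - i for i, ch in enumerate(p[:-1])}
    matches = []
    i = 0
    while i <= n - m:
        j = m - 1
        while j >= 0 and t[i + j] == p[j]:  # compare right to left
            j -= 1
        if j < 0:
            matches.append(t_pos[i])
        # Shift by the table entry for the text character under the
        # pattern's last position, or by the full length if it is absent.
        i += shift.get(t[i + m - 1], m)
    return matches
```

With these assumptions, searching the vocalized word `"\u0643\u064E\u062A\u064E\u0628\u064E"` for the unvocalized pattern `"\u0643\u062A\u0628"` reports a match at index 0, and a pattern written with bare alef matches a text word written with hamza on alef. Folding up front keeps the search loop identical to plain BMH, at the price of an extra pass over the text.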
