Conference: International Conference on Intelligence Science and Big Data Engineering

An Efficient String Searching Algorithm Based on Occurrence Frequency and Pattern of Vowels and Consonants in a Pattern


Abstract

Information and communication technologies give people access to a wide range of documents and information. The huge volume of documents on the Internet and on storage disks has made search time increasingly important. In particular, as the size and number of documents on the Internet grow, the rising time and cost of string searching have become a heavy burden on search services. However, most string searching algorithms consider neither lexical structure nor the occurrence frequency of vowels. Formal documents (articles, news, novels, etc.) share an important characteristic: they are written in well-formed English. The words in such documents come from a limited set of words and letters listed in a dictionary, and this limited set has a predictable occurrence probability in real-world documents. We use the letter occurrence probability as the first search condition. We analyze all the words in several dictionaries (dictionary of free dictionary project, scrabblehelper - Revision 20, Winedit dictionary) and calculate the occurrence probability of each letter as a repeated vowel, repeated consonant, non-repeated vowel, and non-repeated consonant. In this paper, we define search rules and propose a string searching algorithm based on the occurrence frequency and patterns of vowels and consonants. The algorithm uses only the occurrence patterns and repeated positions of vowels and consonants in a text. The proposed algorithm (the OFRP algorithm), based on the occurrence frequency and repetition pattern of vowels and consonants, can therefore be applied usefully and effectively to real-world string search services and web search engines.
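
The abstract does not spell out the OFRP search rules, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes that a letter counts as "repeated" when it occurs more than once within the same word, and that "first search condition" means testing the least probable letter of the pattern before any other comparison. The function names (`class_frequencies`, `letter_probabilities`, `find_all`) and the tiny word list are hypothetical.

```python
from collections import Counter

VOWELS = set("aeiou")


def _letters(word):
    """Yield the lowercase ASCII letters of a word."""
    return (ch for ch in word.lower() if "a" <= ch <= "z")


def class_frequencies(words):
    """Count letters in four classes: (vowel|consonant) x (repeated|single).

    Assumption: a letter is "repeated" if it occurs more than once
    in the same word.
    """
    counts = {
        (kind, rep): Counter()
        for kind in ("vowel", "consonant")
        for rep in ("repeated", "single")
    }
    for word in words:
        for ch, n in Counter(_letters(word)).items():
            kind = "vowel" if ch in VOWELS else "consonant"
            rep = "repeated" if n > 1 else "single"
            counts[(kind, rep)][ch] += n
    return counts


def letter_probabilities(words):
    """Estimate the occurrence probability of each letter over a word list."""
    total = Counter()
    for word in words:
        total.update(_letters(word))
    n = sum(total.values())
    return {ch: c / n for ch, c in total.items()} if n else {}


def find_all(text, pattern, probs):
    """Return all start indices of pattern in text.

    The least probable pattern letter is compared first, so most
    non-matching alignments are rejected after one comparison.
    """
    if not pattern or len(pattern) > len(text):
        return []
    # Position of the rarest pattern character (unknown letters count as 0).
    probe = min(
        range(len(pattern)),
        key=lambda i: probs.get(pattern[i].lower(), 0.0),
    )
    hits = []
    for start in range(len(text) - len(pattern) + 1):
        if text[start + probe] != pattern[probe]:
            continue  # cheap rejection on the rarest letter
        if text[start:start + len(pattern)] == pattern:
            hits.append(start)
    return hits


words = ["banana", "letter", "sky"]
probs = letter_probabilities(words)
print(find_all("the letter was better", "tter", probs))
```

In practice the probabilities would be computed once from a full dictionary, such as those named in the abstract, and reused for every search. Whether the paper also uses the repetition classes to skip alignments is not stated here, so the sketch only counts them.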
