An Efficient String Searching Algorithm Based on Occurrence Frequency and Pattern of Vowels and Consonants in a Pattern



Abstract

Information and communication technologies give people access to a wide range of documents and information. The sheer volume of documents on the Internet and on storage disks has made search time more important. In particular, as the size and number of documents on the Internet grow, the rising time and cost of string searches have become a heavy burden on search services. However, most string searching algorithms consider neither lexical structure nor the occurrence frequency of vowels. Formal documents (articles, news, novels, etc.) share an important characteristic: they are written in well-formed English. The words in formal documents are drawn from a limited set of words and letters listed in a dictionary, and this limited set has a predictable occurrence probability in real-world documents. We use the letter occurrence probability as the first search condition. We analyze all the words in three dictionaries (the dictionary of the free dictionary project, scrabblehelper - Revision 20, and the Winedit dictionary) and calculate the occurrence probability of each letter as a repeated vowel, repeated consonant, non-repeated vowel, and non-repeated consonant. In this paper, we define and propose search rules and a string searching algorithm based on the occurrence frequency and patterns of vowels and consonants. We use only the occurrence patterns and repeated positions of vowels and consonants in a text. The proposed string searching algorithm (the OFRP algorithm) is therefore based on the occurrence frequency and repetition pattern of vowels and consonants, and it can be applied usefully and effectively to string search services and web search engines.
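The abstract does not spell out the OFRP search rules, so the following is only a rough sketch of the two ideas it names: estimating letter occurrence probabilities from a word list, and using that probability as the first search condition. The function names (`letter_probabilities`, `classify`, `search`), the five-letter vowel set, and the rarest-letter-first filter are all assumptions made for illustration, not the authors' algorithm.

```python
from collections import Counter

# Assumption: vowels are the five letters a, e, i, o, u.
VOWELS = set("aeiou")


def letter_probabilities(words):
    """Estimate each letter's occurrence probability from a word list."""
    counts = Counter(ch for w in words for ch in w.lower() if ch.isalpha())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def classify(word):
    """Group a word's letters into repeated / non-repeated vowels and consonants."""
    counts = Counter(ch for ch in word.lower() if ch.isalpha())
    groups = {
        "repeated_vowels": set(),
        "repeated_consonants": set(),
        "single_vowels": set(),
        "single_consonants": set(),
    }
    for ch, n in counts.items():
        kind = "vowels" if ch in VOWELS else "consonants"
        prefix = "repeated" if n > 1 else "single"
        groups[f"{prefix}_{kind}"].add(ch)
    return groups


def search(text, pattern, probs):
    """Return every start index of pattern in text.

    Hypothetical filter: the pattern letter with the lowest estimated
    probability is compared first, so most windows are rejected after
    a single character comparison.
    """
    if not pattern or len(pattern) > len(text):
        return []
    pivot = min(
        range(len(pattern)),
        key=lambda i: probs.get(pattern[i].lower(), 0.0),
    )
    hits = []
    for start in range(len(text) - len(pattern) + 1):
        if text[start + pivot] != pattern[pivot]:
            continue  # rare letter absent: skip the full comparison
        if text[start:start + len(pattern)] == pattern:
            hits.append(start)
    return hits


if __name__ == "__main__":
    probs = letter_probabilities(["banana", "apple", "zebra"])
    print(classify("banana"))
    print(search("the zebra saw a zebra", "zebra", probs))
```

In this sketch the probabilities would normally come from a full dictionary rather than three words, and the match itself is case-sensitive; only the probability lookup is lowercased.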


Bibliographic record

Source
International Conference on Intelligence Science and Big Data Engineering, 2015, 10 pages
Authors
Kwang Sik Chung; Soo Young Kim; Heon Chang Yu
Original format: PDF
Chinese Library Classification: TP18-53
Keywords
String search; Vowel and consonant-based string search; Occurrence frequency of vowels; Occurrence frequency of consonants; Repetition pattern of vowels; Repetition pattern of consonants


