Boosting the Quality of Approximate String Matching by Synonyms

Lu Jiaheng; Lin Chunbin; Wang Wei; Li Chen; Xiao Xiaokui

Abstract

A string-similarity measure quantifies the similarity between two text strings for approximate string matching or comparison. For example, the strings "Sam" and "Samuel" can be considered similar. Most existing work on computing the similarity of two strings considers only syntactic similarity, such as the number of common words or q-grams. While this is indeed an indicator of similarity, there are many important cases where syntactically different strings represent the same real-world object. For example, "Bill" is a short form of "William," and "Database Management Systems" can be abbreviated as "DBMS." Given a collection of predefined synonyms, the purpose of this article is to exploit such existing knowledge to evaluate the similarity between two strings effectively and to perform similarity searches and joins efficiently, thereby boosting the quality of approximate string matching.
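The gap between syntactic and synonym-aware similarity described in the abstract can be illustrated with a small sketch. This is not the article's algorithm: it is a naive token-level Jaccard measure that expands known short forms before comparing. The synonym table and the names `SYNONYMS`, `tokens`, and `jaccard` are assumptions made up for this illustration.

```python
# Hypothetical synonym table (an assumption for illustration): short form -> long form.
SYNONYMS = {
    "bill": "william",
    "dbms": "database management systems",
}


def tokens(text, synonyms=None):
    """Lowercase and split on whitespace, optionally expanding known short forms."""
    synonyms = synonyms or {}
    result = []
    for token in text.lower().split():
        # A single-token short form may expand into several tokens.
        result.extend(synonyms.get(token, token).split())
    return result


def jaccard(a, b, synonyms=None):
    """Jaccard similarity of the two token sets (1.0 when both are empty)."""
    set_a, set_b = set(tokens(a, synonyms)), set(tokens(b, synonyms))
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Purely syntactic comparison: only "gates" is shared.
    print(jaccard("Bill Gates", "William Gates"))
    # Synonym-aware comparison: "bill" is expanded to "william" first.
    print(jaccard("Bill Gates", "William Gates", SYNONYMS))
```

Without the table, "Bill Gates" and "William Gates" share one of three distinct tokens, so the score is 1/3; with it, both strings reduce to the same token set and the score is 1. The sketch only handles single-token short forms, applies each rule in one direction, and ignores ambiguous synonyms, all of which a practical measure would need to address.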

Bibliographic details

Source
ACM Transactions on Database Systems | 2015, Issue 3 | pp. 15.1-15.42 | 42 pages
Authors
Lu Jiaheng; Lin Chunbin; Wang Wei; Li Chen; Xiao Xiaokui
Author affiliations

Renmin Univ China, Beijing, Peoples R China | Univ Helsinki, Dept Comp Sci, FI-00014 Helsinki, Finland;

Univ Calif San Diego, La Jolla, CA 92093 USA;

Univ New S Wales, Sch Engn & Comp Sci, Sydney, NSW 2052, Australia;

Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA;

Nanyang Technol Univ, Sch Engn & Comp Sci, Singapore 639798, Singapore;

Indexed in
Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format
PDF
Language
English
Keywords
Algorithms; Experimentation; Performance; Theory; String similarity search; similarity join; semantic search

Similar literature

1. New algorithms for fixed-length approximate string matching and approximate circular string matching under the Hamming distance [J]. Ho ThienLuan, Oh Seung-Rohk, Kim HyunJin. Journal of Supercomputing, 2018, Issue 5.
2. Correction to: New algorithms for fixed-length approximate string matching and approximate circular string matching under the Hamming distance [J]. Ho ThienLuan, Oh Seung-Rohk, Kim HyunJin. Journal of Supercomputing, 2018, Issue 5.
3. Optimal implementations of the approximate string matching and the approximate discrete signal matching on the memory machine models [J]. Koji Nakano. Parallel Algorithms and Applications, 2014, Issue 1a2.
4. Approximate String Matching for Iris Recognition by Means of Boosted Gabor Wavelets [C]. Climent J., Blanco J.D., Hexsel R.A. 23rd SIBGRAPI Conference on Graphics, Patterns and Images, 2010.
5. Dual-stage boosting systems: Modeling of configurations, matching and boost control options [D]. Lee, Byungchan. 2009.
6. libFLASM: a software library for fixed-length approximate string matching [O]. Lorraine A. K. Ayad, Solon P. Pissis, Ahmad Retha. 2016.
7. Boosting the Quality of Approximate String Matching by Synonyms [O]. Lu, Jiaheng; Lin, Chunbin; Wang, Wei. 2015.
