Suffix Tree Based Approach for Chinese Information Retrieval

Abstract

With the spread of the Internet, research interest in Chinese information retrieval has grown considerably in recent years. The absence of word boundaries in Chinese makes Chinese information retrieval (IR) different from IR for European languages. To apply traditional IR approaches to Chinese, sentences must first be segmented into words, so word segmentation plays a key role in Chinese IR. Because word segmentation is not straightforward and its results are sometimes ambiguous, n-grams are used as an alternative. Several experimental studies have compared words with n-grams [5, 6] and examined word segmentation and its effect on information retrieval [3]. These studies show that words and n-grams lead to comparable performance, and that higher word segmentation accuracy does not necessarily result in better retrieval performance. In this paper we propose a suffix tree based approach for Chinese information retrieval without word segmentation.
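The idea of retrieving Chinese text without segmenting it into words can be illustrated with a small sketch. The code below is not the authors' method: it uses an uncompressed suffix trie (a simpler, more memory-hungry relative of the suffix tree) and the names `char_ngrams`, `SuffixTrie`, and the sample `docs` are hypothetical, chosen only for illustration. It shows character n-gram extraction and substring lookup over unsegmented text.

```python
def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of an unsegmented string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class Node:
    """Trie node: child nodes keyed by character, plus ids of documents passing through."""

    def __init__(self):
        self.children = {}
        self.doc_ids = set()


class SuffixTrie:
    """Naive generalized suffix trie (illustrative only; quadratic space per document)."""

    def __init__(self):
        self.root = Node()

    def add(self, doc_id, text):
        # Insert every suffix of the text, tagging each visited node with doc_id.
        for start in range(len(text)):
            node = self.root
            for ch in text[start:]:
                node = node.children.setdefault(ch, Node())
                node.doc_ids.add(doc_id)

    def search(self, query):
        # Any substring of a document is a prefix of one of its suffixes,
        # so walking the query from the root finds all matching documents.
        node = self.root
        for ch in query:
            node = node.children.get(ch)
            if node is None:
                return set()
        return set(node.doc_ids)


# Hypothetical sample collection, stored as raw character strings.
docs = {1: "信息检索方法", 2: "中文信息处理", 3: "后缀树"}

trie = SuffixTrie()
for doc_id, text in docs.items():
    trie.add(doc_id, text)

print(char_ngrams("信息检索"))       # ['信息', '息检', '检索']
print(sorted(trie.search("信息")))   # [1, 2]
```

Because every substring is indexed, a query of any length can be matched directly, with no decision about where words begin or end. A real suffix tree would compress single-child paths to keep the index linear in the text length.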

Bibliographic details

Source: International Conference on Intelligent Systems Design and Applications | 2008 | 5 pages
Authors: Huang Jin Hu; Powers David
Original format: PDF
CLC classification: TP18-53
Keywords: Suffix Tree

Similar literature

1. Sequence Comparison Alignment-Free Approach Based on Suffix Tree and L-Words Frequency [J]. Inês Soares, Ana Goios, António Amorim. ScientificWorldJournal. 2012, Issue 3
2. Suffix tree-based approach to detecting duplications in sequence diagrams [J]. Liu H., Niu Z., Ma Z. Software, IET. 2011, Issue 4
3. A Chinese Web Page Clustering Algorithm Based on the Suffix Tree [J]. YANG Jian-wu. Wuhan University Journal of Natural Sciences. 2004, Issue 5
4. Suffix Tree Based Approach for Chinese Information Retrieval [C]. Huang Jin Hu, Powers David. International Conference on Intelligent Systems Design and Applications. 2008
5. Genetic sequence data retrieval and manipulation based on generalized suffix trees [D]. Bieganski, Paul. 1995
6. Sequence Comparison Alignment-Free Approach Based on Suffix Tree and L-Words Frequency [O]. Inês Soares, Ana Goios, António Amorim. 2012
7. Suffix Tree Based Approach for Chinese Information Retrieval [O]. Jin Hu Huang, David Powers. 2015
8. Retrieval by Shape Population: An Index Tree Approach [R]. Liu, L., Sclaroff, S. 2001
