Comparison Probabilistic Latent Semantic Indexing Model In Chinese Information Retrieval


Abstract

As the volume of information on the Internet grows, Web mining has become a focus of information retrieval. Web clustering groups similar Web documents according to a chosen similarity metric. However, classical clustering algorithms search the solution space without direction and lack semantic features. This paper compares probabilistic latent semantic indexing (PLSI) models that use word segmentation, two-grams, and keyword extraction, respectively. For comparison, vector space models using the different Chinese information retrieval techniques are tested as well. The experimental results show that correct word segmentation clearly improves retrieval precision for the PLSI model but is not effective for the vector space model. Indexing based on keyword extraction achieves the highest accuracy for the PLSI model.
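The two-gram indexing unit named in the abstract can be illustrated with a minimal sketch. This record does not give the paper's implementation, so the function names `char_bigrams` and `term_counts` and the toy documents below are assumptions made purely for illustration. The sketch only shows how overlapping two-character units could be turned into per-document term counts, the kind of input a PLSI or vector space model would consume.

```python
from collections import Counter


def char_bigrams(text: str) -> list[str]:
    """Return overlapping two-character units, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


def term_counts(docs: list[str]) -> list[Counter]:
    """Count bigram occurrences per document (one Counter per document)."""
    return [Counter(char_bigrams(d)) for d in docs]


# Hypothetical toy documents, not taken from the paper's test collection.
docs = ["信息检索", "语义索引模型"]
counts = term_counts(docs)
print(char_bigrams(docs[0]))
```

Unlike word segmentation, this indexing needs no dictionary or segmenter, which is presumably why two-grams serve as one of the compared baselines.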

Bibliographic record

Source: International Forum on Information Technology and Applications | 2009 | 4 pages
Original format: PDF
Chinese Library Classification: G202-53
Keywords: Internet; data mining; indexing; information retrieval; Chinese information retrieval; PLSI model; Web clustering; Web documents; Web mining; key words extraction; probabilistic latent semantic indexing model; word segmentation; N-Grams retrieval; probabilistic latent semantic indexing

Similar literature

1. A probabilistic model for Latent Semantic Indexing [J]. Ding CHQ. Journal of the American Society for Information Science and Technology. 2005, No. 6
2. Analysis of Vector Space Model, Latent Semantic Indexing and Formal Concept Analysis for Information Retrieval [J]. Cybernetics and information technologies: CIT. 2012, No. 1
3. An Information Retrieval Model Based on Latent Semantic Indexing with Intelligent Preprocessing [J]. Ch. AswaniKumar, Ankush Gupta, Mahmooda Batool, Journal of information & knowledge management. 2005, No. 4
4. Comparison Probabilistic Latent Semantic Indexing Model In Chinese Information Retrieval [C]. International Forum on Information Technology and Applications. 2009
5. Content-Based Retrieval of Arabic Historical Manuscripts Using Latent Semantic Indexing [D]. Yahia, Mohammad Husni Najib. 2011
6. Prediction of nuclear proteins using nuclear translocation signals proposed by probabilistic latent semantic indexing [O]. Emily Chia-Yu Su, Jia-Ming Chang, Cheng-Wei Cheng, 2012
7. A Probabilistic Model for Latent Semantic Indexing [O]. Chris H. Q. Ding. 2005
8. Similarity-Based Probability Model for Latent Semantic Indexing [R]. Ding, C. H. Q. 1999
