Improvement of Query Hit List Precision with a Document Clustering Technique


Abstract

We propose a new approach to improving query hit list precision in document information retrieval. We use the k-means clustering technique to group the documents in a returned hit list. The relevancy of each cluster is evaluated from the relevancy scores of the documents it contains. The final relevancy score of each document combines the relevancy score of its cluster with that of the individual document. To form clusters whose features are more closely related to the query, we use pseudo-feedback documents to construct a latent semantic index (LSI), which transforms all the documents in the hit list into LSI feature vectors. These feature vectors, built from relevant features, are the input to the clustering algorithm. We show that an LSI based on relevant documents significantly improves hit list cluster coherence, in the sense that the clusters group query-relevant and query-irrelevant documents separately. We also show that the improved cluster quality, which results in better separation between relevant and irrelevant documents, can be used to improve the precision of a query hit list significantly.
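The re-ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how cluster relevancy is computed or how the two scores are combined, so the sketch assumes the cluster score is the mean of its members' scores and the final score is a weighted sum controlled by a hypothetical parameter `alpha`. The LSI projection is omitted; `vectors` stands in for feature vectors that are assumed to have been produced already. The names `kmeans` and `rerank` are illustrative only.

```python
from math import dist


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means; the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


def rerank(doc_ids, vectors, scores, k=2, alpha=0.5):
    """Sort doc ids by a blend of cluster and individual relevancy."""
    labels = kmeans(vectors, k)
    # Assumption: cluster relevancy is the mean score of its members.
    cluster_score = {}
    for c in set(labels):
        member_scores = [s for s, lab in zip(scores, labels) if lab == c]
        cluster_score[c] = sum(member_scores) / len(member_scores)
    # Assumption: final score is a weighted sum of the two scores.
    final = {
        d: alpha * cluster_score[lab] + (1 - alpha) * s
        for d, s, lab in zip(doc_ids, scores, labels)
    }
    return sorted(doc_ids, key=lambda d: final[d], reverse=True)


# Toy hit list: "a" and "c" lie close together in feature space.
doc_ids = ["a", "b", "c", "d"]
vectors = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
scores = [0.9, 0.7, 0.6, 0.1]
print(rerank(doc_ids, vectors, scores))  # prints ['a', 'c', 'b', 'd']
```

In this toy input, document "c" shares a cluster with the top-scoring document "a", so its blended score rises above that of "b", which shares a cluster with the low-scoring "d". Setting `alpha` to 0 would reproduce the original ranking.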


Bibliographic record

Source: Information Resources Management Association International Conference, vol. 1; May 23-26, 2004; New Orleans, LA (US) | 2004 | pp. 193-196 | 4 pages
Conference location: New Orleans, LA (US)
Authors: Ciya Liao; Shamim Alpha; Paul Dixon
Affiliation: Oracle Corporation
Original format: PDF
Language: English
Chinese Library Classification: Information resources and their management
Date indexed: 2022-08-26 13:56:37

