Journal of Management Information Systems

Accommodating Individual Preferences in the Categorization of Documents: A Personalized Clustering Approach

Abstract

As electronic commerce and knowledge economy environments proliferate, both individuals and organizations increasingly generate and consume large amounts of online information, typically available as textual documents. To manage this ever-increasing volume of documents, individuals and organizations frequently organize their documents into categories that facilitate document management and subsequent access and browsing. Document clustering is an intentional act that should reflect individual preferences with regard to the semantic coherency and relevant categorization of documents. Hence, effective document clustering must consider individual preferences and needs to support personalization in document categorization. In this paper, we present an automatic document-clustering approach that incorporates an individual's partial clustering as preferential information. Combining two document representation methods, feature refinement and feature weighting, with two clustering methods, precluster-based hierarchical agglomerative clustering (HAC) and atomic-based HAC, we establish four personalized document-clustering techniques. Using a traditional content-based document-clustering technique as a performance benchmark, we find that the proposed personalized document-clustering techniques improve clustering effectiveness, as measured by cluster precision and cluster recall.
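
To make the idea of using a partial clustering as preferential information concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the user's partial clusters simply serve as the starting clusters of an average-link HAC over plain term-frequency vectors; the abstract does not specify the feature refinement or feature weighting steps, so they are omitted here. The function names (`seeded_hac`, `cluster_precision_recall`) and the pair-counting definition of cluster precision and recall are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def tf_vector(text):
    """Bag-of-words term-frequency vector for one document."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def average_link(a, b, vectors):
    """Mean pairwise similarity between the documents of clusters a and b."""
    total = sum(cosine(vectors[i], vectors[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def seeded_hac(docs, partial_clusters, k):
    """Average-link HAC that starts from a user's partial clustering.

    Assumption (hypothetical simplification): the user's groups are
    disjoint lists of document indices and are never split; every other
    document starts as a singleton cluster.
    """
    vectors = [tf_vector(d) for d in docs]
    seeded = {i for group in partial_clusters for i in group}
    clusters = [list(g) for g in partial_clusters]
    clusters += [[i] for i in range(len(docs)) if i not in seeded]
    while len(clusters) > k:
        # Merge the most similar pair of clusters (a < b by construction).
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_link(clusters[p[0]], clusters[p[1]], vectors),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


def pair_set(clusters):
    """All unordered document pairs that share a cluster."""
    return {frozenset(p) for c in clusters for p in combinations(c, 2)}


def cluster_precision_recall(predicted, reference):
    """Pair-counting precision and recall (an assumed definition)."""
    pred, ref = pair_set(predicted), pair_set(reference)
    correct = len(pred & ref)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(ref) if ref else 0.0
    return precision, recall


docs = [
    "stock market trading shares",
    "market shares investor stock",
    "football match goal league",
    "league football team goal",
    "investor trading stock fund",
    "team match player football",
]
# The user has already grouped documents 0-1 and 2-3; 4 and 5 are unassigned.
result = seeded_hac(docs, partial_clusters=[[0, 1], [2, 3]], k=2)
print(result)  # [[0, 1, 4], [2, 3, 5]]
```

In this toy run the two unassigned documents are absorbed into the user's existing groups, which is the behaviour a personalized approach aims for. A purely content-based baseline would correspond to calling `seeded_hac` with an empty `partial_clusters` list.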