Source: Advances in information retrieval. (foreign-language conference proceedings)

Language Modelling of Constraints for Text Clustering

Abstract

Constrained clustering is a recently introduced family of semi-supervised learning algorithms. These methods use domain information to impose constraints on the clustering output. Such constraints (typically pairwise constraints between documents) are usually introduced by designing new clustering algorithms that enforce them. In this paper we present an alternative approach to constrained clustering in which, instead of defining new algorithms or objective functions, the constraints are introduced by modifying the document representation through language modelling. More precisely, the constraints are modelled using the well-known Relevance Models, which have been used successfully in other retrieval tasks such as pseudo-relevance feedback. To the best of our knowledge, this is the first attempt at such an approach. The results show that the presented approach is an effective method for constrained clustering, even improving on the results of existing constrained clustering algorithms.
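To make the idea of moving constraints into the document representation more concrete, here is a minimal sketch. It is not the paper's method: the abstract does not give the estimation formulas, so the function names (`unigram_lm`, `constrained_representation`), the uniform weighting of linked documents, the additive smoothing, and the interpolation weight `lam` are all assumptions chosen for illustration. Only must-link pairs are handled. Each document's unigram language model is interpolated with the average model of the documents it is linked to, so that any off-the-shelf clustering algorithm would then see linked documents as more similar.

```python
from collections import Counter


def unigram_lm(tokens, vocab, mu=0.1):
    """Additively smoothed unigram distribution over vocab (mu is an assumed smoothing constant)."""
    counts = Counter(tokens)
    total = len(tokens) + mu * len(vocab)
    return {w: (counts[w] + mu) / total for w in vocab}


def constrained_representation(docs, must_link, lam=0.5, mu=0.1):
    """Return one word distribution per document, shifted toward its must-linked documents.

    docs: list of token lists; must_link: list of (i, j) index pairs.
    lam (assumed): weight given to the model estimated from linked documents.
    """
    vocab = sorted({w for d in docs for w in d})
    base = [unigram_lm(d, vocab, mu) for d in docs]

    # Collect, for every document, the documents it is must-linked to.
    neighbours = {i: set() for i in range(len(docs))}
    for a, b in must_link:
        neighbours[a].add(b)
        neighbours[b].add(a)

    reps = []
    for i, lm in enumerate(base):
        linked = neighbours[i]
        if not linked:
            # Unconstrained documents keep their original language model.
            reps.append(dict(lm))
            continue
        # Uniformly weighted average of the linked documents' models:
        # a simplified, relevance-model-like estimate (assumption).
        rel = {w: sum(base[j][w] for j in linked) / len(linked) for w in vocab}
        reps.append({w: (1 - lam) * lm[w] + lam * rel[w] for w in vocab})
    return reps


docs = [["apple", "fruit"], ["pear", "fruit"], ["car", "engine"]]
reps = constrained_representation(docs, must_link=[(0, 1)])
```

In this toy run, document 0 gains probability mass on "pear" because it is must-linked to document 1, while document 2 is left unchanged. The resulting distributions could then be passed, as vectors, to a standard clustering algorithm.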
