Language Modelling of Constraints for Text Clustering

Abstract

Constrained clustering is a recently presented family of semi-supervised learning algorithms. These methods use domain information to impose constraints on the clustering output. Those constraints (typically pairwise constraints between documents) are introduced by designing new clustering algorithms that enforce their satisfaction. In this paper we present an alternative approach to constrained clustering where, instead of defining new algorithms or objective functions, the constraints are introduced by modifying the document representation through language modelling. More precisely, the constraints are modelled using the well-known Relevance Models, which have been used successfully in other retrieval tasks such as pseudo-relevance feedback. To the best of our knowledge, this is the first attempt at such an approach. The results show that the presented approach is an effective method for constrained clustering, even improving on the results of existing constrained clustering algorithms.
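
The abstract does not give the estimation formulas, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes unigram maximum-likelihood document models and folds must-link constraints into a document's representation by interpolating its model with the average model of the documents it is linked to, loosely in the spirit of relevance-model feedback. The function names, the uniform weighting of linked documents, and the interpolation weight `lam` are all illustrative assumptions.

```python
from collections import Counter


def unigram_lm(tokens):
    """Maximum-likelihood unigram model: term -> probability."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def constrained_lm(doc_id, models, must_links, lam=0.5):
    """Mix a document's model with the mean model of its must-linked documents.

    Hypothetical helper: `lam` and the uniform average are assumptions,
    not values or choices taken from the paper.
    """
    own = models[doc_id]
    linked = must_links.get(doc_id, [])
    if not linked:
        # No constraints for this document: representation is unchanged.
        return dict(own)
    vocab = set(own)
    for other in linked:
        vocab.update(models[other])
    mixed = {}
    for term in vocab:
        feedback = sum(models[o].get(term, 0.0) for o in linked) / len(linked)
        mixed[term] = (1 - lam) * own.get(term, 0.0) + lam * feedback
    return mixed


# Toy collection: d1 and d2 are must-linked, d3 has no constraints.
docs = {
    "d1": "cluster text documents".split(),
    "d2": "language model retrieval".split(),
    "d3": "football match score".split(),
}
models = {doc_id: unigram_lm(tokens) for doc_id, tokens in docs.items()}
must_links = {"d1": ["d2"], "d2": ["d1"]}

new_d1 = constrained_lm("d1", models, must_links, lam=0.5)
```

After this step, `new_d1` assigns probability mass to the terms of `d2` as well, so an unmodified clustering algorithm comparing document models would see the two must-linked documents as closer. Cannot-link constraints and smoothing against a collection model are left out of this sketch.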

Bibliographic details

Source: Advances in Information Retrieval | 2012 | pp. 352-363 | 12 pages
Conference location: Barcelona (ES)
Authors: Javier Parapar; Alvaro Barreiro
Author affiliation: IRLab, Computer Science Department, University of A Coruna, Spain
Original format: PDF
Language: English
Chinese Library Classification: Information processing

Similar literature

1. A Text Clustering Approach of Chinese News Based on Neural Network Language Model [J]. Zhaoxin Fan, Shuoying Chen, Li Zha. International Journal of Parallel Programming. 2016, No. 1
2. SEMANTIC TEXT CLUSTERING USING ENHANCED VECTOR SPACE MODEL USING NEPALI LANGUAGE [J]. Chiranjibi Sitaula. Computer Sciences and Telecommunications. 2012, No. 4
3. Language Model Adaptation Using Machine-Translated Text for Resource-Deficient Languages [J]. Arnar Thor Jensson, Koji Iwano, Sadaoki Furui. EURASIP Journal on Audio, Speech, and Music Processing. 2009, No. 1
4. Language Modeling with Linguistic Cluster Constraints [C]. Frederick Jelinek, Jia Cui. International Conference on Text, Speech and Dialogue. 2007
5. Visual modeling of XML constraints based on a new extensible constraint markup language [D]. Hu, Jingkun. 2003
6. Document Sublanguage Clustering to Detect Medical Specialty in Cross-institutional Clinical Texts [O]. Kristina Doing-Harris, Olga Patterson, Sean Igo
7. Language Modelling of Constraints for Text Clustering [O]. Javier Parapar, Álvaro Barreiro. 2012
