Journal: IEEE Transactions on Knowledge and Data Engineering

Coclustering Multiple Heterogeneous Domains: Linear Combinations and Agreements


Abstract

The high-order coclustering problem, i.e., the problem of simultaneously clustering heterogeneous types of domains, has become an active research area in the last few years, due to the notable impact it has on several application scenarios. This problem is generally tackled by optimizing a weighted combination of functions measuring the quality of coclustering over each pair of domains, where weights are chosen based on the supposed reliability/relevance of their correlation. However, little knowledge is likely to be available, in practice, to set these weights in a definite and precise manner. More importantly, it might even be conceptually unclear whether to prefer one weighting scheme over others in those cases where the functions encode contrasting goals, so that improving the quality for one pair of domains leads to a deterioration for other pairs. The aim of this paper is precisely to shed light on the impact of weighting schemes on techniques based on linear combinations of pairwise objective functions, and to define an approach that overcomes the above problems by looking for an agreement (intuitively, a kind of compromise) among the various domains, thereby getting rid of the need to define an appropriate weighting scheme. Two algorithms performing coclustering on "star-structured" domains, based on linear combinations and agreements, respectively, have been designed within an information-theoretic framework. Results from a thorough experimentation, on both synthetic and real data, are discussed, in order to assess the effectiveness of the approaches and to get more insight into their actual behavior.
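
To make the weighting issue concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a star structure with one central domain and two satellite domains, and it assumes that the quality of each pairwise coclustering is measured by the loss in mutual information after clusters are merged, a common choice in information-theoretic coclustering. The abstract does not spell out the objective, so the function names (`mutual_information`, `pairwise_loss`, `combined_loss`) and the toy joint distributions are hypothetical.

```python
import math


def mutual_information(joint):
    """Mutual information, in bits, of a joint distribution given as a 2-D list."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    mi = 0.0
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            if p > 0:
                mi += p * math.log2(p / (row[i] * col[j]))
    return mi


def aggregate(joint, row_labels, col_labels):
    """Collapse a joint distribution onto the given row and column clusters."""
    k = max(row_labels) + 1
    m = max(col_labels) + 1
    out = [[0.0] * m for _ in range(k)]
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            out[row_labels[i]][col_labels[j]] += p
    return out


def pairwise_loss(joint, row_labels, col_labels):
    """Mutual information lost by coclustering one (center, satellite) pair."""
    clustered = aggregate(joint, row_labels, col_labels)
    return mutual_information(joint) - mutual_information(clustered)


def combined_loss(joints, center_labels, satellite_labels, weights):
    """Weighted linear combination of the pairwise losses (assumed objective)."""
    return sum(
        w * pairwise_loss(p, center_labels, s)
        for p, s, w in zip(joints, satellite_labels, weights)
    )


# Toy data (hypothetical): a central domain with 4 elements, two satellites.
# The first satellite groups the center as {0,1},{2,3}; the second as {0,2},{1,3}.
P1 = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]
P2 = [[0.25, 0.0], [0.0, 0.25], [0.25, 0.0], [0.0, 0.25]]
satellites = [[0, 1], [0, 1]]   # satellite elements are left unmerged
A = [0, 0, 1, 1]                # center clustering favored by P1
B = [0, 1, 0, 1]                # center clustering favored by P2

for weights in [(0.8, 0.2), (0.2, 0.8)]:
    loss_a = combined_loss([P1, P2], A, satellites, weights)
    loss_b = combined_loss([P1, P2], B, satellites, weights)
    print(weights, "loss(A) =", round(loss_a, 3), "loss(B) =", round(loss_b, 3))
```

In this toy setting the two satellites encode contrasting goals: clustering `A` loses no information with respect to the first satellite but all of it with respect to the second, and clustering `B` does the opposite. Which clustering minimizes the combined loss therefore depends entirely on the chosen weights, which is the difficulty the abstract attributes to linear-combination approaches.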
