Journal: IEEE Transactions on Cybernetics

From Principal Curves to Granular Principal Curves

Abstract

Principal curves, an essential construct in dimensionality reduction and data analysis, have recently attracted much attention from both theoretical and practical perspectives. In many real-world situations, however, the efficiency of existing principal curves algorithms is arguable, in particular when dealing with massive data, owing to their high computational complexity. A further drawback is that in several applications principal curves cannot fully capture some essential problem-oriented facets of the data, such as width, aspect ratio, and width change. Information granulation is a powerful tool for processing and interpreting massive data. In this paper, invoking the underlying ideas of information granulation, we propose a granular principal curves approach, regarded as an extension of principal curves algorithms, to improve efficiency and achieve a sound accuracy-efficiency tradeoff. First, large amounts of numerical data are granulated into $C$ intervals (information granules) developed with the use of fuzzy C-means clustering and the two criteria of information granulation, which significantly reduces the amount of data to be processed in the later phases of the overall design. Granular principal curves are then constructed by determining the upper and lower bounds of the interval data. Finally, we develop an objective function using the criteria of information confidence and specificity to evaluate the granular output formed by the principal curves. We also optimize the granular principal curves by adjusting the level of information granularity (the number of clusters), which is realized with the aid of particle swarm optimization. A number of numerical studies on synthetic and real-world datasets provide quantifiable insight into the effectiveness of the proposed algorithm.
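The granulation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain NumPy fuzzy C-means, assigns each point to its highest-membership cluster, and takes per-coordinate minimum and maximum values as the interval bounds. The paper's two granulation criteria, the objective function, and the particle swarm optimization over $C$ are not reproduced here, and the names `fuzzy_c_means` and `granulate` are hypothetical.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy C-means. Returns (centers, U) with U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)  # memberships of each point sum to 1
    for _ in range(n_iter):
        W = U ** m
        # Cluster prototypes: membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distances from every point to every prototype.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Standard FCM membership update, written in terms of squared distances.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def granulate(X, c, **fcm_kwargs):
    """Summarize X by at most c axis-aligned interval granules.

    Assumption (not from the paper): each point goes to its
    highest-membership cluster, and a granule's bounds are the
    per-coordinate min and max of the points assigned to it.
    """
    _, U = fuzzy_c_means(X, c, **fcm_kwargs)
    labels = U.argmax(axis=1)
    lower, upper = [], []
    for k in range(c):
        members = X[labels == k]
        if len(members) == 0:
            continue  # skip empty clusters
        lower.append(members.min(axis=0))
        upper.append(members.max(axis=0))
    return np.array(lower), np.array(upper)
```

Calling `granulate(X, c=6)` on an `(n, d)` array returns two arrays of lower and upper bounds, one row per non-empty granule. In the spirit of the abstract, a principal curve algorithm would then be fitted to these bounds instead of the full dataset.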