Conference: International Conference on Artificial Intelligence and Soft Computing

Identifying Uncertain Galaxy Morphologies Using Unsupervised Learning

Abstract

With the onset of massive cosmological data collection through surveys such as the Sloan Digital Sky Survey (SDSS), galaxy classification has been accomplished for the most part with the help of citizen science communities like Galaxy Zoo. However, an analysis of one of the Galaxy Zoo morphological classification data sets has shown that a significant majority of all classified galaxies are, in fact, labelled as "Uncertain". This has driven us to conduct experiments with the k-means clustering algorithm on data obtained from the SDSS database using each galaxy's right ascension and declination values, together with the Galaxy Zoo morphology class label. This paper identifies the best attributes for clustering using a heuristic approach and, accordingly, applies an unsupervised learning technique in order to improve the classification of galaxies labelled as "Uncertain" and increase the overall accuracy of such data clustering processes. Through this heuristic approach, it is observed that the accuracy of classes-to-clusters evaluation, by selecting the best combination of attributes via information gain, is further improved by approximately 10-15%. An accuracy of 82.627% was also achieved after conducting various experiments on the galaxies labelled as "Uncertain" and reinserting them into the original data set. It is concluded that a vast majority of these galaxies are, in fact, of spiral morphology, with a small subset potentially consisting of stars, elliptical galaxies or galaxies of other morphological variants.
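
To make the evaluation described in the abstract concrete, the following is a minimal sketch of k-means clustering followed by a classes-to-clusters accuracy score. It is not the authors' implementation. It assumes that each cluster is mapped to its majority class (a simplification), that the two numeric features are hypothetical stand-ins for whichever attributes information gain would select, and that the eight data points and their labels are made up purely for illustration. The initial centroids are fixed by hand so the run is reproducible.

```python
from collections import Counter


def kmeans(points, init_centroids, max_iters=100):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples.

    Returns (assignment, centroids), where assignment[i] is the index of
    the cluster that points[i] belongs to.
    """
    centroids = [tuple(c) for c in init_centroids]
    k = len(centroids)
    assignment = None
    for _ in range(max_iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_assignment = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            new_assignment.append(dists.index(min(dists)))
        if new_assignment == assignment:
            break  # assignments stopped changing, so the run has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids


def classes_to_clusters_accuracy(assignment, labels):
    """Fraction of points whose label matches their cluster's majority label.

    Assumption: majority-vote mapping from clusters to classes; the class
    labels are used only for scoring, never for clustering.
    """
    per_cluster = {}
    for cluster, label in zip(assignment, labels):
        per_cluster.setdefault(cluster, Counter())[label] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in per_cluster.values())
    return correct / len(labels)


# Made-up toy data: two hypothetical numeric attributes per galaxy.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
]
# Made-up labels; one "elliptical" galaxy sits in the mostly "spiral" group.
labels = ["spiral", "spiral", "spiral", "elliptical",
          "elliptical", "elliptical", "elliptical", "elliptical"]

# Hand-picked starting centroids (an arbitrary choice for reproducibility).
assignment, centroids = kmeans(points, init_centroids=[points[0], points[4]])
accuracy = classes_to_clusters_accuracy(assignment, labels)
print(assignment, accuracy)
```

On this toy input the two groups are separated cleanly, and the single mismatched label is what keeps the score below 1. With real data the result depends on the starting centroids, so several initialisations would normally be compared.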
