International Joint Conference on Computer Science and Software Engineering

Classification of Astronomical Objects in the Galaxy M81 using Machine Learning Techniques II. An Application of Clustering in Data Pre-processing



Abstract

Identifying objects of a particular class in current astronomical data is challenging. In this study, we explored methods to identify globular cluster candidates from a pool of astronomical objects in the galaxy M81. First, we developed a method to cross-match the data automatically; in the previous study, this step was done by manually overlaying the imaging data. The process also eliminated data points that appear in only one or two filters, which indicates that they are artifacts. Next, we used the Expectation Maximization (EM) clustering technique to label the training dataset with classes and to reduce the manual effort required in preprocessing. Our results show that the data can be clustered into 12 clusters, which can be grouped into 6 groups of astronomical objects with similar morphological structures. When these 6 groups of data were used to build classification models, the prediction accuracies improved significantly: from 79.9% to 90.57% for Random Forest, and from 67.1% to 91.59% for Multilayer Perceptron. Moreover, when the model built from those data was applied to an unseen dataset, it categorized the objects into classes whose characteristics are close to those recognized in astronomy. However, the model still cannot fully separate globular clusters from foreground stars and background galaxies, because their photometric properties are similar.
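The abstract does not say how the automatic cross-match is implemented, so the following is only a minimal sketch of the general idea: detections from different filters are grouped by position, and any source seen in fewer than three filters is dropped as a likely artifact. The function name `cross_match`, the `(filter, x, y)` tuple layout, the filter names `B`, `V`, `I`, the matching radius, and the greedy first-match strategy are all assumptions made for illustration, not details from the paper.

```python
import math

def cross_match(detections, radius=1.0, min_filters=3):
    """Greedy positional cross-match (illustrative assumption, not the paper's method).

    detections: list of (filter_name, x, y) tuples, a hypothetical layout.
    A detection joins the first existing source whose reference position
    (that of its first detection) lies within `radius`; otherwise it
    starts a new source. Sources seen in fewer than `min_filters`
    distinct filters are discarded as probable artifacts.
    """
    sources = []  # each source: {"x": float, "y": float, "filters": set}
    for filt, x, y in detections:
        for src in sources:
            if math.hypot(src["x"] - x, src["y"] - y) <= radius:
                src["filters"].add(filt)
                break
        else:
            # No existing source is close enough: start a new one.
            sources.append({"x": x, "y": y, "filters": {filt}})
    return [s for s in sources if len(s["filters"]) >= min_filters]

# Hypothetical detections in three filters (names are placeholders).
detections = [
    ("B", 10.0, 10.0), ("V", 10.2, 9.9), ("I", 9.9, 10.1),  # seen in 3 filters
    ("B", 50.0, 50.0), ("V", 50.1, 50.0),                   # seen in 2 filters
    ("I", 80.0, 20.0),                                      # seen in 1 filter
]
kept = cross_match(detections)
print(len(kept), sorted(kept[0]["filters"]))
```

With this toy input, only the first source survives the three-filter requirement. A real catalog would more likely be matched on sky coordinates with a spatial index instead of this quadratic loop.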
