Home > Foreign-language journals > Journal of neurosurgical sciences > Outlier Detection for Water Supply Data Based on Joint Auto-Encoder

Outlier Detection for Water Supply Data Based on Joint Auto-Encoder


Abstract

With the development of science and technology, the state of the water environment has received increasing attention. In this paper, we propose a deep learning model, named the Joint Auto-Encoder network, to solve the problem of outlier detection in water supply data. The Joint Auto-Encoder network first expands the size of the training data and extracts useful features from the input data, and then reconstructs the input data into an output. Outliers are detected based on the network's reconstruction errors, with a larger reconstruction error indicating a higher likelihood of being an outlier. For water supply data, there are mainly two types of outliers: those with large values and those with values close to zero. We set two separate thresholds, tau(1) and tau(2), on the reconstruction errors to detect the two types of outliers respectively. Data samples with reconstruction errors exceeding the thresholds are voted to be outliers. The two thresholds can be calculated from the classification confusion matrix and the receiver operating characteristic (ROC) curve. We also compare the Joint Auto-Encoder with the vanilla Auto-Encoder on both a synthetic data set and the MNIST data set. Our model outperforms the vanilla Auto-Encoder and some other outlier detection approaches, with a recall rate of 98.94 percent on water supply data.
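The thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a point is picked on the ROC curve, so the sketch assumes Youden's J statistic (maximizing TPR minus FPR). The function names `roc_threshold` and `flag_outliers` and all error values are hypothetical, and each outlier type is handled independently because the abstract does not specify how the two thresholds are combined.

```python
import numpy as np

def roc_threshold(errors, is_outlier):
    """Pick the reconstruction-error threshold that maximizes TPR - FPR.

    Assumption: Youden's J is used as the ROC selection criterion;
    the abstract only says the thresholds come from the confusion
    matrix and the ROC curve.
    """
    errors = np.asarray(errors, dtype=float)
    is_outlier = np.asarray(is_outlier, dtype=bool)
    if is_outlier.all() or not is_outlier.any():
        raise ValueError("need both normal and outlier samples")

    best_tau, best_j = None, -np.inf
    # Each distinct error value is a candidate threshold (one ROC point).
    for tau in np.unique(errors):
        predicted = errors >= tau
        # Confusion-matrix counts at this threshold.
        tp = np.sum(predicted & is_outlier)
        fp = np.sum(predicted & ~is_outlier)
        fn = np.sum(~predicted & is_outlier)
        tn = np.sum(~predicted & ~is_outlier)
        j = tp / (tp + fn) - fp / (fp + tn)
        if j > best_j:
            best_tau, best_j = tau, j
    return best_tau

def flag_outliers(errors, tau):
    """Mark samples whose reconstruction error reaches the threshold."""
    return np.asarray(errors, dtype=float) >= tau

# Hypothetical validation errors, invented for illustration only.
normal_err = [0.01, 0.02, 0.03, 0.02]
large_value_err = [0.90, 1.20]   # outliers with large values
near_zero_err = [0.30, 0.40]     # outliers with values close to zero

tau1 = roc_threshold(normal_err + large_value_err, [0, 0, 0, 0, 1, 1])
tau2 = roc_threshold(normal_err + near_zero_err, [0, 0, 0, 0, 1, 1])
flags = flag_outliers([0.02, 0.35, 1.00], tau2)
```

On real data the errors would come from the trained network rather than hand-written lists, and a library routine such as scikit-learn's `roc_curve` could replace the explicit loop.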
