Conference: International Conference on Advanced Data Mining and Applications (ADMA 2011)

Topic Discovery and Topic-Driven Clustering for Audit Method Datasets

Abstract

With the promotion of China's Golden Auditing Project and the rapid growth of online auditing, thousands of new computer audit methods emerge every year to meet the varied needs of audit practice. Organizing these existing computer audit methods and using them intelligently has become a fundamental and challenging problem. In this paper, we propose using topic-driven clustering methods to organize computer audit methods according to the system of computer audit methods issued by the National Audit Office of China. We also apply latent Dirichlet allocation (LDA) to audit method datasets at different levels of granularity. Our experimental results on social insurance computer audit methods show that the topic-driven clustering scheme with topics created by domain experts is the best scheme overall, achieving an average purity of 0.862 across the datasets. Topics discovered by LDA were consistent with the classes defined in the taxonomy for four of the five datasets, and they were effective when used in the topic-driven clustering scheme.
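The abstract reports results as cluster purity. As a rough illustration, the sketch below computes purity under the standard definition (each cluster is credited with its most frequent true class, and the credited counts are divided by the total number of items). This is an assumption: the abstract does not state the exact formula the authors used, and the function name, cluster ids, and class names here are hypothetical.

```python
from collections import Counter


def purity(cluster_labels, class_labels):
    """Standard clustering purity (assumed definition, not taken from the paper).

    cluster_labels[i] is the cluster assigned to item i;
    class_labels[i] is its reference class from the taxonomy.
    """
    if len(cluster_labels) != len(class_labels):
        raise ValueError("label sequences must have the same length")
    if not cluster_labels:
        raise ValueError("at least one item is required")

    # Count how often each reference class occurs inside each cluster.
    per_cluster = {}
    for cluster, cls in zip(cluster_labels, class_labels):
        per_cluster.setdefault(cluster, Counter())[cls] += 1

    # Credit each cluster with the size of its majority class.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(cluster_labels)


# Hypothetical toy data: 8 audit methods, 3 clusters, 3 reference classes.
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
classes = ["pension", "pension", "medical", "medical", "medical",
           "unemployment", "unemployment", "pension"]
score = purity(clusters, classes)
```

In this toy case each cluster has a majority class covering two of its items, so six of the eight items are credited and the purity is 0.75. A purity of 1.0 would mean every cluster contains items from a single class.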