Conference: IEEE/ACM International Conference on Mining Software Repositories

500+ Times Faster than Deep Learning: (A Case Study Exploring Faster Methods for Text Mining StackOverflow)




Deep learning methods are useful for high-dimensional data and are becoming widely used in many areas of software engineering. Deep learners utilize extensive computational power and can take a long time to train, making it difficult to widely validate, repeat, and improve their results. Further, they are not the best solution in all domains. For example, recent results show that for finding related Stack Overflow posts, a tuned SVM performs similarly to a deep learner, but is significantly faster to train. This paper extends that recent result by clustering the dataset, then tuning every learner within each cluster. This approach is over 500 times faster than deep learning (and over 900 times faster if we use all the cores on a standard laptop computer). Significantly, this faster approach generates classifiers nearly as good (within 2% F1 score) as the much slower deep learning method. Hence we recommend this faster method since it is much easier to reproduce and utilizes far fewer CPU resources. More generally, we recommend that before researchers release research results, they compare their supposedly sophisticated methods against simpler alternatives (e.g., applying simpler learners to build local models).
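The "cluster, then tune a learner within each cluster" idea from the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm, tuner, or features were used, so everything concrete here is an assumption made for illustration: scikit-learn's `KMeans` for clustering, `GridSearchCV` over a small `SVC` grid for tuning, synthetic data from `make_classification` instead of Stack Overflow posts, and the hypothetical helper names `fit_local_models`, `predict_local`, and `demo`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Assumed search space; the abstract does not list the tuned parameters.
PARAM_GRID = {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]}


def fit_local_models(X, y, n_clusters=3, cv=3, seed=0):
    """Cluster the training data, then tune one SVM per cluster."""
    clusterer = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
    models = {}
    for c in range(n_clusters):
        mask = clusterer.labels_ == c
        Xc, yc = X[mask], y[mask]
        counts = np.unique(yc, return_counts=True)[1]
        if len(counts) < 2 or counts.min() < cv:
            # Too few examples of some class to cross-validate:
            # fall back to predicting the cluster's majority class.
            models[c] = DummyClassifier(strategy="most_frequent").fit(Xc, yc)
        else:
            search = GridSearchCV(SVC(), PARAM_GRID, cv=cv, scoring="f1_macro")
            models[c] = search.fit(Xc, yc).best_estimator_
    return clusterer, models


def predict_local(clusterer, models, X):
    """Route each row to its nearest cluster and use that cluster's model."""
    clusters = clusterer.predict(X)
    preds = np.empty(len(X), dtype=int)
    for c, model in models.items():
        mask = clusters == c
        if mask.any():
            preds[mask] = model.predict(X[mask])
    return preds


def demo():
    """Train and evaluate on synthetic data (a stand-in for real posts)."""
    X, y = make_classification(
        n_samples=600, n_features=10, n_informative=5, random_state=0
    )
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    clusterer, models = fit_local_models(X_tr, y_tr)
    preds = predict_local(clusterer, models, X_te)
    accuracy = float((preds == y_te).mean())
    return models, preds, accuracy
```

Each local model sees only its own cluster's rows, so every tuning run works on a smaller dataset, and the per-cluster runs are independent of one another. That independence is what would allow them to be spread across cores, in the spirit of the multi-core speedup the abstract reports.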


