Conference: International conference on quantum interaction

Study of Engineered Features and Learning Features in Machine Learning - A Case Study in Document Classification

Abstract

Document classification is challenging because it must handle voluminous, highly non-linear data, which is generated at an exponential rate in the era of digitization. Proper representation of documents increases the efficiency and performance of classification, which serves the ultimate goal of retrieving information from a large corpus. Deep neural network models learn features for document classification, unlike engineered-feature-based approaches, where features are extracted or selected from the data. In this paper we investigate the performance of different classifiers based on the features obtained using the two approaches. We apply a deep autoencoder to learn features, while engineered features are extracted by exploiting semantic association among the terms of the documents. Experimentally, we observe that learned-feature-based classification always performs better than the proposed engineered-feature-based classifiers.
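
To make the idea of learned features concrete, the following is a minimal sketch of an autoencoder that compresses term-presence vectors into a low-dimensional code. It is an illustration under stated assumptions, not the authors' model: the abstract does not give the architecture, so a single sigmoid hidden layer stands in for the deep autoencoder, and the toy corpus `docs`, the function `train_autoencoder`, and all hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_autoencoder(docs, hidden=2, epochs=300, lr=0.5, seed=0):
    """Train a one-hidden-layer autoencoder with plain SGD.

    Assumption: a shallow network stands in for the paper's deep
    autoencoder, whose layout is not described in the abstract.
    Returns (encode, losses): the encoder function and the mean
    squared reconstruction error per epoch.
    """
    rng = random.Random(seed)
    n = len(docs[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n)]
    b2 = [0.0] * n

    def encode(x):
        # Hidden activations are the learned feature vector.
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(W1, b1)]

    def decode(h):
        return [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b)
                for row, b in zip(W2, b2)]

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in docs:
            h = encode(x)
            y = decode(h)
            total += sum((yi - xi) ** 2 for yi, xi in zip(y, x))
            # Backpropagate the squared reconstruction error.
            d_out = [(yi - xi) * yi * (1 - yi) for yi, xi in zip(y, x)]
            d_hid = [sum(d_out[j] * W2[j][k] for j in range(n)) * h[k] * (1 - h[k])
                     for k in range(hidden)]
            for j in range(n):
                for k in range(hidden):
                    W2[j][k] -= lr * d_out[j] * h[k]
                b2[j] -= lr * d_out[j]
            for k in range(hidden):
                for i in range(n):
                    W1[k][i] -= lr * d_hid[k] * x[i]
                b1[k] -= lr * d_hid[k]
        losses.append(total / len(docs))
    return encode, losses


# Hypothetical toy corpus: rows are documents, columns mark term presence.
docs = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
encode, losses = train_autoencoder(docs)
features = [encode(x) for x in docs]  # learned features, one vector per document
```

The vectors in `features` could then be passed to any classifier in place of hand-engineered features, which is the comparison the abstract describes.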