Journal: Expert Systems with Applications

Learning to classify short text from scientific documents using topic models with various types of knowledge


Abstract

Classifying short text is challenging because of data sparseness, a typical characteristic of short text. In this paper, we propose methods for enhancing features using topic models, which make short text seem less sparse and more topic-oriented for classification. We apply topic model analysis based on Latent Dirichlet Allocation (LDA) to enriched datasets, and then present new feature-enhancement methods that combine external texts from the topic models so that documents become more effective for classification. In experiments, we use the titles of scientific articles as short text documents and enrich them using topic models built from various types of universal datasets, in order to show that our approach performs efficiently.
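
The general idea of enriching a short title with words drawn from a topic model can be illustrated with a minimal sketch. This is not the paper's method: the topic-word weights in `TOPIC_WORDS` are made-up values standing in for the output of an LDA model trained on an external corpus, the scoring rule (summing the weights of matching tokens) is a simplification of real LDA inference, and the names `enrich`, `topic_scores`, and `bag_of_words` are hypothetical.

```python
from collections import Counter

# Hypothetical topic-word weights. In practice these would come from an LDA
# model trained on an external corpus; the numbers here are invented.
TOPIC_WORDS = {
    "topic_ml": {"learning": 0.30, "classification": 0.25, "model": 0.20,
                 "feature": 0.15, "training": 0.10},
    "topic_bio": {"gene": 0.35, "protein": 0.25, "cell": 0.20,
                  "expression": 0.12, "sequence": 0.08},
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def topic_scores(tokens, topic_words):
    """Score each topic by summing the weights of the tokens it contains.

    This is a crude stand-in for inferring a document's topic distribution.
    """
    return {
        topic: sum(weights.get(tok, 0.0) for tok in tokens)
        for topic, weights in topic_words.items()
    }


def enrich(text, topic_words, top_n=3):
    """Append the top-weighted words of the best-matching topic to the text.

    Returns the original tokens unchanged when no topic matches at all.
    """
    tokens = tokenize(text)
    scores = topic_scores(tokens, topic_words)
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return tokens
    weights = topic_words[best]
    top_words = sorted(weights, key=weights.get, reverse=True)[:top_n]
    extra = [w for w in top_words if w not in tokens]
    return tokens + extra


def bag_of_words(tokens):
    """Turn a token list into a sparse count vector for a classifier."""
    return Counter(tokens)


enriched = enrich("Protein sequence alignment", TOPIC_WORDS)
features = bag_of_words(enriched)
```

In this toy setup the three-word title gains two topic words it did not contain, so its bag-of-words vector shares more dimensions with other titles on the same topic, which is the sense in which enrichment makes short text less sparse.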
