Journal: International Journal of Web Engineering and Technology

Short text classification using feature enrichment from credible texts



Abstract

Classifying tweet contents can provide a useful feature for other application tasks. However, such classification is challenging because tweets are short and their contents are sparse. Although individual tweets are limited in length, their contents cover many different topics. Because of this diversity, achieving good coverage of content features remains a challenge. In this research, we adopt a keyword expansion technique and study the enrichment of tweet contents using text from credible sources, such as news sites. For evaluation, we conduct experiments on two Twitter datasets using four standard classifiers. The proposed approach improves classification performance, with accuracy gains ranging from +0.05% to +3.54% across both datasets. The experimental results demonstrate that the proposed feature enrichment method can overcome the sparseness limitation of short text, improving classification performance across various classifiers.
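
The general idea of enriching a short text with terms drawn from longer, credible texts can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not describe their algorithm, so the corpus `NEWS_CORPUS`, the stopword list, the co-occurrence ranking, and the names `tokenize` and `enrich` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-in for text from credible sources (e.g. news articles).
NEWS_CORPUS = [
    "The central bank raised interest rates to curb inflation.",
    "Inflation and interest rates dominate the economy debate.",
    "The striker scored twice as the team won the football final.",
]

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "to", "and", "as", "a", "of", "in", "on", "is", "are"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def enrich(tweet, corpus, top_k=3):
    """Append up to top_k terms that co-occur with the tweet's keywords.

    A corpus document contributes terms only if it shares at least one
    keyword with the tweet. This simple co-occurrence count is an assumed
    expansion rule chosen for illustration.
    """
    tokens = tokenize(tweet)
    keywords = set(tokens)
    counts = Counter()
    for doc in corpus:
        doc_tokens = tokenize(doc)
        if keywords & set(doc_tokens):
            counts.update(t for t in doc_tokens if t not in keywords)
    # Sort by descending count, then alphabetically, for deterministic output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return tokens + [term for term, _ in ranked[:top_k]]


if __name__ == "__main__":
    print(enrich("rates up again", NEWS_CORPUS))
```

In this toy setting, the three-token tweet "rates up again" gains the terms "inflation", "interest", and "bank" from the two news sentences that mention "rates". The enriched token list could then be passed to any bag-of-words vectorizer and standard classifier, which is one plausible way to reduce the sparsity the abstract describes.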
