Arabic Text Summarization Based on Latent Semantic Analysis to Enhance Arabic Documents Clustering

Abstract

Arabic document clustering is an important task for obtaining good results with traditional Information Retrieval (IR) systems, especially given the rapid growth in the number of online documents written in Arabic. Document clustering aims to automatically group similar documents into one cluster using different similarity/distance measures. This task is often affected by document length: useful information in the documents is often accompanied by a large amount of noise, so it is necessary to eliminate this noise while keeping the useful information in order to boost clustering performance. In this paper, we propose to evaluate the impact of text summarization using the Latent Semantic Analysis model on Arabic document clustering in order to solve the problems cited above, using five similarity/distance measures: Euclidean Distance, Cosine Similarity, Jaccard Coefficient, Pearson Correlation Coefficient, and Averaged Kullback-Leibler Divergence, each evaluated twice: without and with stemming. Our experimental results indicate that the proposed approach effectively solves the problems of noisy information and document length, and thus significantly improves clustering performance.
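The five measures named in the abstract can be illustrated on plain term-weight vectors. The following is a minimal sketch, not the authors' implementation: the abstract does not give formulas, so the definitions below are the commonly used ones and are assumptions here. In particular, the Jaccard coefficient is taken in its extended (Tanimoto) form for real-valued vectors, and the averaged Kullback-Leibler divergence is taken as the symmetric, weighted form often used in document clustering. All function names are hypothetical, and inputs are assumed to be equal-length, non-negative, non-zero vectors.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two term-weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between the vectors (assumes neither is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def jaccard_coefficient(a, b):
    """Extended Jaccard (Tanimoto) coefficient; an assumed definition."""
    dot = sum(x * y for x, y in zip(a, b))
    sq_a = sum(x * x for x in a)
    sq_b = sum(y * y for y in b)
    return dot / (sq_a + sq_b - dot)


def pearson_correlation(a, b):
    """Pearson correlation (assumes neither vector is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def averaged_kl_divergence(a, b):
    """Symmetric, weighted KL divergence; an assumed definition.

    For each term t with weights x and y:
        pi1 = x / (x + y), pi2 = y / (x + y), m = pi1 * x + pi2 * y
        contribution = pi1 * x * log(x / m) + pi2 * y * log(y / m)
    Terms where a weight is zero contribute nothing for that weight.
    """
    total = 0.0
    for x, y in zip(a, b):
        if x + y == 0:
            continue  # term absent from both documents
        pi1 = x / (x + y)
        pi2 = y / (x + y)
        m = pi1 * x + pi2 * y
        if x > 0:
            total += pi1 * x * math.log(x / m)
        if y > 0:
            total += pi2 * y * math.log(y / m)
    return total
```

Note that Euclidean distance and the averaged KL divergence are distances (smaller means more similar), while the other three are similarities (larger means more similar), so a clustering routine would need to convert between the two conventions consistently.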
