Journal: Research journal of applied science, engineering and technology

Similarity Measurements of Vector Space Model on Arabic Text



Abstract

This study presents an effective retrieval model by comparing a set of similarity measures within the Vector Space Model (VSM). To support the comparison, two inverted-file mechanisms are used after stop-word removal: a word-oriented mechanism for indexing the text collection, and a block-oriented mechanism. The study uses a collection of 242 Arabic abstracts and a constructed collection of 60 Arabic queries. While the inverted file is built as the index file, time and space costs are computed. After the system is run, recall and precision are calculated to compare the retrieval efficiency of using the inverted file. The VSM measures compared are the Cosine measure, the Dice measure, the Jaccard measure and Inner product similarity. Among these, VSM with the Jaccard measure gave good results in comparison with the other measures, particularly on the Arabic document collection. The block-oriented mechanism also gave better results than the word-oriented mechanism. The study concludes that the best information retrieval model for Arabic documents is VSM with the Jaccard measure using the block-oriented technique.
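The four measures named in the abstract can be illustrated with a minimal Python sketch. It assumes raw term-frequency weights and the common weighted-vector forms of Dice and Jaccard; the abstract does not state the weighting scheme or the exact formulas used in the study, so the function names, the weighting and the toy tokens here are all hypothetical.

```python
import math
from collections import Counter


def term_vector(tokens):
    """Build a sparse term-frequency vector (assumed weighting: raw counts)."""
    return Counter(tokens)


def inner_product(q, d):
    """Sum of weight products over the terms shared by query and document."""
    return sum(w * d[t] for t, w in q.items() if t in d)


def _squared_norm(v):
    """Sum of squared weights of a sparse vector."""
    return sum(w * w for w in v.values())


def cosine(q, d):
    """Inner product normalised by the product of the vector lengths."""
    denom = math.sqrt(_squared_norm(q)) * math.sqrt(_squared_norm(d))
    return inner_product(q, d) / denom if denom else 0.0


def dice(q, d):
    """Twice the inner product over the sum of the squared norms."""
    denom = _squared_norm(q) + _squared_norm(d)
    return 2 * inner_product(q, d) / denom if denom else 0.0


def jaccard(q, d):
    """Inner product over the squared norms minus the inner product."""
    ip = inner_product(q, d)
    denom = _squared_norm(q) + _squared_norm(d) - ip
    return ip / denom if denom else 0.0


if __name__ == "__main__":
    # Toy tokens standing in for stop-word-filtered Arabic terms.
    query = term_vector(["a", "b"])
    doc = term_vector(["b", "c"])
    for name, fn in [("inner", inner_product), ("cosine", cosine),
                     ("dice", dice), ("jaccard", jaccard)]:
        print(name, fn(query, doc))
```

In a retrieval run, each query vector would be scored against every candidate document reached through the inverted file, and the documents ranked by the chosen measure. The sketch leaves out indexing, stop-word removal and the recall and precision evaluation.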
