Conference: International Conference on Applied Machine Learning

Ranking of Odia Text Document Relevant to User Query Using Vector Space Model

Abstract

In this digital world, there is no shortage of digitized text, images, audio and video on any subject, but extracting the conceptually relevant material is very difficult and is achieved through different information retrieval (IR) technologies. The Boolean Matching Model, Probabilistic Model, Extended Boolean Model, Fuzzy Set Model and Vector Space Model are some of the widely used models for information retrieval. Among these, the vector space model (VSM) is one of the classical and widely used models, and it is based on the mathematical concepts of linear algebra. In VSM, queries and documents are vectors in a high-dimensional space, and terms are used as dimensions to construct the index keys that represent the documents. In this paper, we consider different approaches to the vector space model for information retrieval and compare the results to gain a clear understanding of the term count model, the classical vector space model and the normalized vector space model. All three models are based on the concepts of term frequency (tf) and inverse document frequency (idf). Here we apply the vector space retrieval technique to Odia text to get an overview of how VSM works on a regional language like Odia. Odia (the name was officially changed from Oriya to Odia in November 2011) is one of the languages mainly spoken by the people living in Odisha, a state in south-eastern India. This model can also be used for text summarization and data mining purposes.
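
The tf-idf weighting and vector comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formulas for its three models, so the sketch assumes one common variant (raw term count times `log(N / df)`, compared by cosine similarity). The function names, the whitespace tokenizer and the English placeholder documents are all hypothetical; real Odia text would likely need language-specific tokenization and stemming.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain whitespace tokenization, no stemming or stop words.
    return text.split()


def build_idf(docs_tokens):
    # idf(t) = log(N / df(t)), where df(t) is the number of documents containing t.
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def tfidf_vector(tokens, idf):
    # Sparse vector: one dimension per term, weight = raw term count * idf.
    # Terms never seen in the collection are dropped.
    tf = Counter(tokens)
    return {term: tf[term] * idf[term] for term in tf if term in idf}


def cosine(u, v):
    # Cosine of the angle between two sparse vectors; 0.0 if either is all-zero.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    # Returns (document index, score) pairs, highest score first.
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    q_vec = tfidf_vector(tokenize(query), idf)
    scores = [
        (i, cosine(q_vec, tfidf_vector(tokens, idf)))
        for i, tokens in enumerate(docs_tokens)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Placeholder English documents, used only to exercise the functions.
    docs = ["cat sat mat", "dog barked loudly", "cat chased dog"]
    for index, score in rank("cat mat", docs):
        print(index, round(score, 3))
```

Dividing by the vector norms in `cosine` means that document length does not dominate the ranking; scoring by the raw dot product of term counts instead would correspond more closely to a plain term count approach.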