Journal: Advances in Computer Science and Information Technology (ACSIT)

Facilitating Document Annotation for Efficient user Relevant Search



Abstract

Document annotation is the process of adding metadata to a document, which is essential for retrieving information from that document. As data volumes grow, finding information manually becomes cumbersome, so people searching for information need the assistance of an automated approach. Clustering is an important technique for extracting information from unstructured data by grouping similar items together. It also helps to discover hidden information and to summarize a large amount of data into a small number of groups. We present an approach that maintains a domain-specific dictionary, which helps to generate structured information by recognizing the documents that contain target-specific information. This information is then useful for the subsequent process of querying the database. Text pre-processing is first applied to the input documents to extract terms from the sentences, and the K-means algorithm is then used to cluster the documents. The goal of the system is to maximize the number of relevant documents in the ranked list and to ensure that they appear high in that list.
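
The pipeline outlined in the abstract (term extraction, dictionary-based annotation, K-means clustering) might look roughly like the following minimal sketch. It is not the authors' implementation: the stop-word list, the `DOMAIN_DICTIONARY` entries, the sample documents, the term-frequency weighting, and the choice to seed centroids with the first `k` documents are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import re
from collections import Counter

# Hypothetical stop words and domain dictionary (term -> annotation label);
# the abstract does not list the ones actually used.
STOP_WORDS = {"a", "an", "the", "is", "of", "to", "and", "in", "for", "on", "with"}
DOMAIN_DICTIONARY = {
    "fever": "symptom", "cough": "symptom",
    "invoice": "finance", "payment": "finance",
}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def annotate(tokens):
    """Return the sorted dictionary labels whose terms occur in the document."""
    return sorted({DOMAIN_DICTIONARY[t] for t in tokens if t in DOMAIN_DICTIONARY})


def vectorize(token_lists):
    """Build L2-normalized term-frequency vectors over a shared vocabulary."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vec = [float(counts[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, max_iter=100):
    """Lloyd's algorithm; the first k vectors seed the centroids (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each vector to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up sample documents, for illustration only.
docs = [
    "The patient has a fever and a cough",
    "Payment of the invoice is due",
    "Cough and fever in the patient",
    "The invoice payment is overdue",
]
tokens = [preprocess(d) for d in docs]
annotations = [annotate(t) for t in tokens]
labels = k_means(vectorize(tokens), k=2)
```

Here `annotations` holds the dictionary labels attached to each document as metadata, and `labels` holds the cluster index assigned to each document. A real system would likely use TF-IDF weighting and randomized or k-means++ seeding; fixed seeding is used here only to keep the sketch deterministic.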
