
Weighting document genre in Enterprise Search.

Abstract

The creation of an Enterprise Search system involves many challenges that are not present in Web search. Searching a corporate collection is influenced both by the structure of the data present in the collection and by the policies of the corporation. These structures and policies may differ from corporation to corporation, and from collection to collection.

In particular, an Enterprise Search system must take a document's genre into account. Examples of document genre within a corporate collection might include FAQs, white papers, technical reports, memos, emails and chat messages. Depending on an individual's current work task, it might be appropriate to give one genre a greater weight than another during the processing of a search request. Moreover, this weighting may change as the individual's work task changes.

The work presented in this thesis adapts the Okapi BM25 scoring function to weight term frequency based on the relevance of a document genre to a work task. The method utilizes two user-provided resources, relevance judgments and clickthrough data, to estimate a realistic weight for each task-genre relationship. Using this approach, the method matches the purpose of each user search request with the purpose of each document. Therefore, the proper documents are returned to the user and her/his need can be fulfilled.

The method has been incorporated into a prototype search engine, X-Site, currently deployed on a corporate intranet. X-Site is a contextual search engine that uses the relationships between work tasks and document genres to improve search precision for software engineers. The system provides a customized and user-controlled means of refining search results to suit the task context of a user. Through X-Site, each employee can make a single search request and has access to documents from the Internet, a corporate intranet, and Lotus Notes databases.
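
The abstract does not give the adapted scoring formula, so the following is only a minimal sketch of one way a task-genre weight could enter Okapi BM25: the raw term frequency is multiplied by the weight before BM25's saturation step. The function name `bm25_genre_score`, the `GENRE_WEIGHTS` table, and its task and genre labels and values are all hypothetical and are not taken from the thesis. In the thesis the weights are estimated from relevance judgments and clickthrough data; here they are simply hard-coded.

```python
import math
from collections import Counter

# Hypothetical task-genre weights (illustrative values, not from the thesis).
# A weight above 1 boosts a genre for a task; a weight below 1 demotes it.
GENRE_WEIGHTS = {
    ("debugging", "faq"): 2.0,
    ("debugging", "memo"): 0.5,
}


def bm25_genre_score(query_terms, doc_terms, genre, task,
                     doc_freq, n_docs, avg_len,
                     k1=1.2, b=0.75, weights=GENRE_WEIGHTS):
    """Score one document with BM25, scaling term frequency by a task-genre weight.

    doc_freq maps a term to the number of documents that contain it.
    Unknown (task, genre) pairs fall back to a weight of 1.0, i.e. plain BM25.
    """
    weight = weights.get((task, genre), 1.0)
    tf = Counter(doc_terms)
    # Document-length normalisation term from standard BM25.
    norm = k1 * (1 - b + b * len(doc_terms) / avg_len)
    score = 0.0
    for term in query_terms:
        if tf[term] == 0:
            continue
        df = doc_freq.get(term, 0)
        # Non-negative IDF variant, so that every matching term adds to the score.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        weighted_tf = weight * tf[term]  # assumed place where genre enters
        score += idf * weighted_tf * (k1 + 1) / (weighted_tf + norm)
    return score


if __name__ == "__main__":
    doc = ["null", "pointer", "error", "fix"]
    df = {"pointer": 2, "error": 3}
    for genre in ("faq", "memo"):
        s = bm25_genre_score(["pointer", "error"], doc, genre, "debugging",
                             df, n_docs=10, avg_len=4.0)
        print(genre, round(s, 3))
```

Under these assumptions, the same text scores higher when it is tagged with a genre that is weighted up for the current task and lower when its genre is weighted down, which is the qualitative behaviour the abstract describes.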

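Bibliographic details

  • Author

    Yeung, Peter Chun Kai

  • Author affiliation

    University of Waterloo (Canada)

  • Degree-granting institution: University of Waterloo (Canada)
  • Subject: Mathematics; Computer Science
  • Degree: M.Math.
  • Year: 2007
  • Pages: 104 p.
  • Total pages: 104
  • Original format: PDF
  • Language of text: English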