Keyword-based approaches to improve Internet search.

Abstract

Technology keeps evolving, and so must the science of information retrieval. This thesis presents keyword-based approaches to improve information retrieval from the Internet. Focused and unfocused queries to search engines are considered, and means of obtaining relevant documents are presented. For focused queries, techniques are provided to obtain a high precision score from the hit documents; these documents contain the exact answers to the focused query, which is usually a question. Each user query is subjected to an ambiguity test; if the query is found to be ambiguous, the user is given direction so that the intended meaning is the one actually searched. The query is then modified to form a new, clear, and unambiguous query. This query is sent to several search engines at the same time, and the hit documents from each of these search engines are collated and merged. Hit documents for an ambiguous query are analyzed and ranked based on their actual relevance to the query. Term frequency is used, along with a popularity score, to determine the total score of a relevant document. Every relevant hit document is classified based on its academic relevance. Eight academic categories are considered: (1) Course Notes, (2) Frequently Asked Questions, (3) Research Paper, (4) Technical Report, (5) Thesis, (6) Tutorial, (7) Review, and (8) Research Paper/Technical Report. Once a search is done, a set of relevant documents is presented, along with each document's academic relevance category (if any).
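
The abstract does not give the scoring formula, so the following is only a minimal sketch of the merge-and-rank step under stated assumptions: the popularity score is taken to be the number of search engines that returned a document, and the total score is taken to be the term frequency plus a weighted popularity score. The names `Hit`, `merge_hits`, `term_frequency`, `rank`, and `popularity_weight` are hypothetical and do not come from the thesis.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One hit document returned by a search engine (hypothetical type)."""
    url: str
    text: str


def merge_hits(results_by_engine):
    """Collate hits from several engines, keeping one entry per URL.

    Returns (hits_by_url, popularity). ASSUMPTION: popularity[url] is the
    number of engines that returned that URL; the thesis may define it
    differently.
    """
    hits_by_url = {}
    popularity = Counter()
    for hits in results_by_engine.values():
        # Count each engine at most once per URL.
        for url in {h.url for h in hits}:
            popularity[url] += 1
        for h in hits:
            hits_by_url.setdefault(h.url, h)
    return hits_by_url, popularity


def term_frequency(keywords, text):
    """Total number of occurrences of the query keywords in the text."""
    tokens = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(tokens[k.lower()] for k in keywords)


def rank(keywords, results_by_engine, popularity_weight=1.0):
    """Rank merged hits by an ASSUMED total score:
    term frequency + popularity_weight * popularity.
    """
    hits_by_url, popularity = merge_hits(results_by_engine)
    scored = [
        (term_frequency(keywords, h.text) + popularity_weight * popularity[url], url)
        for url, h in hits_by_url.items()
    ]
    # Highest score first; ties are broken by URL for a stable order.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))
```

With this sketch, a document returned by two engines whose text mentions the keyword twice outranks a document returned by one engine that never mentions it. A real system would likely normalize both components before adding them.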

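Bibliographic record

  • Author

    Hina, Manolo Dulva

  • Affiliation

    Concordia University (Canada)

  • Degree-granting institution: Concordia University (Canada)
  • Discipline: Computer Science
  • Degree: M.Comp.Sc.
  • Year: 2003
  • Pagination: 191 p.
  • Total pages: 191
  • Original format: PDF
  • Language of text: English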

  • Date added to database: 2022-08-17 11:45:57
