Supporting multilingual Internet searching and browsing.

Abstract

The amount of non-English information has grown rapidly in recent years. The broad diversity of this multilingual content presents a substantial research challenge in the field of knowledge discovery and information retrieval. There is therefore increased interest in developing multilingual systems that support information sharing across languages. The goal of this dissertation is to study, through a series of case studies, how different techniques and algorithms can help in multilingual Internet searching and browsing.

A system development research process was adopted as the methodology in this dissertation. In the first part of the dissertation, I discuss the development of CMedPort, a Chinese medical portal that serves the information-seeking needs of Chinese users. A systematic evaluation was conducted to study the effectiveness and efficiency of CMedPort in assisting human analysis. My experimental results show that CMedPort achieved significant improvement in searching and browsing performance compared to three benchmark regional search engines.

The second and third case studies investigate effective and efficient techniques and algorithms that facilitate multilingual Web retrieval. An English-Chinese multilingual Web retrieval system in the business IT domain was developed and evaluated. It was then extended to five languages: English, Chinese, Japanese, German and Spanish. A dictionary-based approach was adopted for query translation. Corpus-based co-occurrence analysis, relevance feedback, and phrasal translation algorithms were used for disambiguation. Evaluation results showed that the system's phrasal translation and co-occurrence disambiguation led to a great improvement in performance.

The last part of this dissertation studies the proper name translation problem. Proper names are often out-of-vocabulary terms and are critical to multilingual Web retrieval. This study proposes a combined Hidden Markov Model and Web mining model to automatically generate proper name translations. The approach was evaluated on two language pairs: English-Arabic and English-Chinese. My results are encouraging and show promise for using transliteration techniques to improve multilingual Web retrieval.

This dissertation has two main contributions. First, it demonstrates how information retrieval, Web mining and artificial intelligence techniques can be used in a multilingual Web-based context. Second, it provides a set of tools that can assist users in their multilingual Web searching and browsing activities.
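As an illustration of the dictionary-based query translation with co-occurrence disambiguation mentioned in the abstract, the following is a minimal sketch and not the dissertation's implementation. The toy dictionary, the co-occurrence counts, and the names `DICTIONARY`, `COOCCURRENCE`, `cooccurrence_score` and `translate_query` are all hypothetical; a real system would derive the counts from a target-language corpus and would likely use a more refined scoring function.

```python
from itertools import product

# Hypothetical toy bilingual dictionary: each English term maps to
# one or more candidate Chinese translations.
DICTIONARY = {
    "bank": ["银行", "河岸"],  # financial institution vs. river bank
    "loan": ["贷款"],
}

# Hypothetical co-occurrence counts for pairs of target-language terms,
# standing in for statistics gathered from a corpus.
COOCCURRENCE = {
    frozenset(["银行", "贷款"]): 120,
    frozenset(["河岸", "贷款"]): 1,
}


def cooccurrence_score(terms):
    """Sum the pairwise co-occurrence counts of the chosen translations."""
    score = 0
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            score += COOCCURRENCE.get(frozenset([terms[i], terms[j]]), 0)
    return score


def translate_query(query_terms):
    """Choose the combination of candidates with the highest score."""
    # Terms missing from the dictionary are passed through untranslated.
    candidates = [DICTIONARY.get(term, [term]) for term in query_terms]
    return list(max(product(*candidates), key=cooccurrence_score))


print(translate_query(["bank", "loan"]))
```

In this sketch the ambiguous term "bank" is resolved by its companion term "loan": the candidate pair that co-occurs more often in the assumed counts is selected. Enumerating every combination is only practical for short queries; it is used here to keep the example small.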

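Bibliographic record

  • Author: Zhou, Yilu
  • Author affiliation: The University of Arizona
  • Degree-granting institution: The University of Arizona
  • Discipline: Information Science
  • Degree: Ph.D.
  • Year: 2006
  • Page: p.1960
  • Total pages: 240
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Information and knowledge dissemination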