Building a Web Thesaurus from Web Link Structure

Abstract

Thesauri have been widely used in many applications, including information retrieval, natural language processing, and question answering. In this paper, we propose a novel approach to automatically constructing a domain-specific thesaurus from the Web using link structure information. The proposed approach is able to identify new terms and reflect the latest relationships between terms as the Web evolves. First, a set of high-quality, representative websites for a specific domain is selected. After navigational links are filtered out, link analysis is applied to each website to obtain its content structure. Finally, the thesaurus is constructed by merging the content structures of the selected websites. Experimental results on automatic query expansion based on the constructed thesaurus show a 20% improvement in search precision over the baseline.
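The pipeline in the abstract (filter navigational links, derive a per-site content structure, merge across sites, expand queries) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and not the paper's actual method: each page is represented by a single term, a link is a `(source_term, target_term)` pair, a link counts as navigational when its target is linked from more than `max_share` of the site's source pages, and relatedness is the number of sites in which two terms are linked. The function names `drop_navigational`, `build_thesaurus`, and `expand_query` are hypothetical.

```python
from collections import Counter, defaultdict


def drop_navigational(links, max_share=0.5):
    """Drop links whose target is linked from more than max_share of the
    site's source pages (an assumed heuristic, not the paper's filter)."""
    sources = {src for src, _ in links}
    inlinks = defaultdict(set)
    for src, dst in links:
        inlinks[dst].add(src)
    return [
        (src, dst)
        for src, dst in links
        if len(inlinks[dst]) / len(sources) <= max_share
    ]


def build_thesaurus(sites, max_share=0.5):
    """Merge the filtered links of every site into a map from each term
    to a Counter of related terms, counting one vote per site."""
    thesaurus = defaultdict(Counter)
    for links in sites:
        for src, dst in set(drop_navigational(links, max_share)):
            thesaurus[src][dst] += 1
            thesaurus[dst][src] += 1
    return thesaurus


def expand_query(terms, thesaurus, top_k=2):
    """Append up to top_k of the most strongly related terms per query term."""
    expanded = list(terms)
    for term in terms:
        for related, _ in thesaurus.get(term, Counter()).most_common(top_k):
            if related not in expanded:
                expanded.append(related)
    return expanded


# Two made-up sites; "home" is linked from every page, so it is filtered out.
site_a = [("camera", "lens"), ("camera", "tripod"),
          ("camera", "home"), ("lens", "home"), ("tripod", "home")]
site_b = [("camera", "lens"), ("lens", "filter"),
          ("camera", "home"), ("lens", "home"), ("filter", "home")]

thesaurus = build_thesaurus([site_a, site_b])
print(expand_query(["camera"], thesaurus, top_k=1))
```

In this toy data, "camera" and "lens" are linked on both sites, so "lens" is the strongest expansion candidate for "camera". The approach described in the abstract derives a content structure through link analysis, which this flat co-link count does not attempt to reproduce.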