Conference: Computing in Civil Engineering

Constructing the Civil Engineering Thesaurus (CET) Using ThesWB


Abstract

This paper describes a method for constructing a thesaurus in the field of civil engineering. The work investigates the potential of thesauri as a tool for information retrieval systems and as an aid in civil engineering. ThesWB, a tool that extracts terms and the relations between them from HTML documents, was used to collect candidate thesaurus terms from the Web. The principal advantage of the Web as a source for thesaurus construction is that it can be viewed as a body of text containing two fundamentally different types of data: the contents and the tags. A tag in HTML is metadata describing the layout of the texts and the linking structure between them. For documents of this kind, different approaches can be applied to extract and structure terms automatically. ThesWB is used to construct domain-independent thesauri from HTML pages.
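The distinction between contents and tags can be illustrated with a small sketch. It is not ThesWB's algorithm, which the abstract does not specify. It assumes one simple heuristic: a heading is treated as a candidate broader term, and each list item that follows it as a candidate narrower term. The names `TermRelationParser` and `extract_relations` and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class TermRelationParser(HTMLParser):
    """Collect (broader, narrower) candidate pairs from headings and list items.

    Assumed heuristic (not taken from the paper): the most recent heading is
    the broader term for every <li> that follows it.
    """

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.current_tag = None      # heading or li element being read, if any
        self.buffer = []             # text fragments of that element
        self.current_heading = None  # latest heading text (candidate broader term)
        self.relations = []          # collected (broader, narrower) tuples

    def handle_starttag(self, tag, attrs):
        # The tags carry the structure: only headings and list items matter here.
        if tag in self.HEADINGS or tag == "li":
            self.current_tag = tag
            self.buffer = []

    def handle_data(self, data):
        # The contents carry the terms: keep text only inside tracked elements.
        if self.current_tag is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self.current_tag:
            return
        # Normalize whitespace and case so equal terms compare equal.
        text = " ".join("".join(self.buffer).split()).lower()
        if tag in self.HEADINGS:
            self.current_heading = text or None
        elif text and self.current_heading:
            self.relations.append((self.current_heading, text))
        self.current_tag = None


def extract_relations(html):
    """Return candidate (broader, narrower) term pairs found in an HTML string."""
    parser = TermRelationParser()
    parser.feed(html)
    parser.close()
    return parser.relations


# Hypothetical sample page, invented for illustration.
sample = """
<h2>Concrete</h2>
<ul><li>Reinforced concrete</li><li>Precast  concrete</li></ul>
<h2>Steel</h2>
<ul><li>Structural steel</li></ul>
"""

relations = extract_relations(sample)
for broader, narrower in relations:
    print(f"{broader} -> {narrower}")
```

Pairs produced this way are only candidates. A real thesaurus would still need term filtering and review of each relation, and hyperlinks between pages could feed a similar heuristic for related terms.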
