International Conference on Information and Knowledge Management

Yahoo! as an Ontology - Using Yahoo! Categories to Describe Documents



Abstract

We suggest that the name of a Yahoo! category (or of any other WWW indexer's category), or a collection of such names, can be used to describe the content of a document. Such categories offer a standardized and universal way of referring to or describing the nature of real-world objects, activities, documents, and so on, and may be used (we suggest) to semantically characterize the content of documents. WWW indices such as Yahoo! provide a huge hierarchy of categories (topics) that touch every aspect of human endeavor. Such topics can be used as descriptors, much as librarians use, for example, the Library of Congress cataloging system to annotate and categorize books. In the course of investigating this idea, we address the problem of automatically categorizing web pages in the Yahoo! directory. We use Telltale as our classifier; Telltale uses n-grams to compute the similarity between documents. We experiment with various types of descriptions for the Yahoo! categories and for the web pages to be categorized. Our findings suggest that the best results occur when using the very brief descriptions of the Yahoo! categorized entries; these brief descriptions are provided either by the entries' submitters or by the Yahoo! human indexers, and they accompany most Yahoo! indexed entries.
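
The abstract does not say how Telltale turns n-grams into a similarity score, so the following is only a minimal sketch of the general idea, not Telltale's actual method. It assumes character trigrams and cosine similarity over raw n-gram counts; the function names, category names, and sample descriptions are all hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_profile(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in lower-cased text.

    Assumption: character n-grams with n=3; the source does not state
    which n or which kind of n-gram Telltale uses.
    """
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse n-gram count vectors.

    Assumption: cosine is a stand-in measure, not Telltale's formula.
    """
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_categories(page_text: str, category_descriptions: dict, n: int = 3) -> list:
    """Return (category, score) pairs sorted from most to least similar."""
    page = ngram_profile(page_text, n)
    scored = [
        (name, cosine_similarity(page, ngram_profile(description, n)))
        for name, description in category_descriptions.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical brief category descriptions, standing in for the short
# descriptions that accompany indexed entries.
categories = {
    "Recreation/Sports/Soccer": "soccer clubs, leagues, and match results",
    "Computers/Programming": "programming languages, compilers, and software tools",
}

ranking = rank_categories(
    "Match results and league tables for local soccer clubs", categories
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch the page is assigned to whichever category description it resembles most. Swapping in longer descriptions (for example, the full text of the pages listed under a category) only requires changing the strings passed to `rank_categories`, which is the kind of variation the experiments in the abstract compare.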
