International Conference on Computational Science and Its Applications (ICCSA 2008)

A Method for Automatic Text Categorization Using Word Sense Disambiguation

Abstract

Information plays a central role in today's societies, and the Internet is one of the most widespread mechanisms for communicating and distributing information around the world. Because of the extremely large number of information sources, automatic mechanisms are needed to filter the information that could be useful to each user. However, one problem that the usual automatic text categorization techniques have not been able to handle is polysemy (words with two or more senses). In this paper, we address this problem by proposing a semantic analyzer for the automatic categorization of texts in Spanish. Context exploration techniques are used as the key mechanism for guiding the disambiguation process. A specific lexical database and its existing semantic relations are used to categorize the analyzed text appropriately. To validate the analyzer, a tool was developed that classifies web pages by semantic sense. We present performance results for this classifier and, finally, report a comparison with four other classification tools.
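
The abstract does not describe the analyzer's internals. As a rough illustration of how exploring a word's context against a lexical database can resolve polysemy before a text is categorized, the following is a minimal sketch of a simplified overlap-based (Lesk-style) approach. It is not the authors' method: the `LEXICON` entries, the category labels, the `window` parameter, and the function names `disambiguate` and `categorize` are all hypothetical and invented for this example.

```python
from collections import Counter

# Hypothetical toy lexical database (invented for illustration only).
# Each ambiguous Spanish word maps to candidate senses; each sense has a
# category label and a set of semantically related words.
LEXICON = {
    "banco": [
        {"category": "finance", "related": {"dinero", "cuenta", "préstamo", "interés"}},
        {"category": "furniture", "related": {"parque", "madera", "sentarse", "plaza"}},
    ],
    "planta": [
        {"category": "botany", "related": {"hoja", "raíz", "flor", "jardín"}},
        {"category": "industry", "related": {"fábrica", "producción", "energía", "obrero"}},
    ],
}


def disambiguate(word, context):
    """Return the sense of `word` whose related words overlap most with `context`."""
    senses = LEXICON.get(word)
    if not senses:
        return None
    context_set = set(context)
    # max() keeps the first sense on ties, so the order in LEXICON acts as a default.
    return max(senses, key=lambda s: len(s["related"] & context_set))


def categorize(text, window=3):
    """Assign the category that receives the most votes from disambiguated words."""
    tokens = text.lower().split()
    votes = Counter()
    for i, tok in enumerate(tokens):
        if tok in LEXICON:
            # Context exploration: look at up to `window` tokens on each side.
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            sense = disambiguate(tok, context)
            votes[sense["category"]] += 1
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    print(categorize("abrí una cuenta en el banco para guardar dinero"))
    print(categorize("me senté en el banco del parque"))
```

In this sketch the same word, "banco", leads to different categories depending on its neighbours. A real system would presumably add tokenization, lemmatization, and a full lexical resource with richer semantic relations.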