Conference: 2016 Online International Conference on Green Engineering and Technologies

A framework for measuring similarity between terms in Short Text Categorization

Abstract

As the amount of information available on the World Wide Web grows, it becomes increasingly difficult for a search engine to return precise results to the user, and some of the information on web pages is inherently ambiguous. Semantic similarity is used to compute a similarity score between texts. It improves search efficiency by processing the user's query in line with the searcher's intent and by deriving the contextual meaning of terms, which yields results similar to the query. A semantic search system considers word position, user intent, synonyms, and the relationships between words in order to produce correct results for the user. Measuring the similarity between two concepts is essential for many applications that aim to give users satisfactory results. However, existing approaches are better suited to measuring semantic similarity between single words than between Multi-Word Expressions (MWE), and they do not scale well. This paper proposes a clustering and classification algorithm for semantic similarity using sample web pages. A further improvement is to analyze short texts for classification, label each short text according to its keywords, and present the result to the end user. This type of classification is suited to tasks such as opinion mining on tweets from Twitter and topic content discovery.
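
The abstract does not specify the algorithm, so the following is only a minimal sketch of the general idea of labeling a short text according to its keywords. It swaps in plain term-frequency cosine similarity, which measures lexical overlap and is not the semantic similarity measure the paper proposes. The labels, keyword lists, and function names are hypothetical and chosen purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_short_text(text, label_keywords):
    """Return (best_label, score); best_label is None if no keyword overlaps."""
    vec = Counter(tokenize(text))
    best_label, best_score = None, 0.0
    for label, keywords in label_keywords.items():
        keyword_vec = Counter(tokenize(" ".join(keywords)))
        score = cosine_similarity(vec, keyword_vec)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical labels and keywords (assumptions, not taken from the paper).
LABEL_KEYWORDS = {
    "sports": ["match", "team", "goal", "score"],
    "technology": ["phone", "software", "battery", "app"],
}

if __name__ == "__main__":
    print(label_short_text("The new phone has a great battery", LABEL_KEYWORDS))
```

A semantic variant would replace the term-frequency vectors with representations that capture synonyms and word relationships, which is the gap the abstract points to for multi-word expressions.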