Conference: IEEE International Conference on Web Services

Web-service Clustering with a Hybrid of Ontology Learning and Information-retrieval-based Term Similarity



Abstract

Organizing Web services into functionally similar clusters is an efficient approach to Web service discovery. An important aspect of the clustering process is calculating the semantic similarity of Web services. Most current clustering approaches are based on similarity-distance measurement, including keyword-based, ontology-based and information-retrieval-based methods. Problems with these approaches include a shortage of high-quality ontologies and a loss of semantic information. In addition, there has been little fine-grained improvement in existing approaches to service clustering. In this paper, we present a new approach to grouping Web services into functionally similar clusters: we mine Web service documents and generate an ontology from the hidden semantic patterns present within the complex terms used in service features, and use that ontology to measure similarity. If calculating the similarity using the generated ontology fails, the similarity is instead calculated with an information-retrieval-based term-similarity method that adopts term-similarity measuring techniques used by thesauri and search engines. Another important factor in clustering performance is identifying the most suitable cluster center. To improve the utility of clusters, we propose an approach to identifying the cluster center that combines service similarity with the term frequency-inverse document frequency (TF-IDF) values of service names. Experimental results show that our clustering approach performs better than existing approaches.
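
The two mechanisms named in the abstract, the ontology-first similarity with an information-retrieval fallback and the cluster-center choice that mixes service similarity with the TF-IDF of service names, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the use of `None` to signal that the ontology-based calculation failed, the per-name TF-IDF sum, and the weighted sum with `alpha` are all assumptions made for illustration; the abstract does not specify how the two quantities are combined.

```python
import math
from typing import Callable, Optional, Sequence


def hybrid_similarity(
    a: str,
    b: str,
    ontology_sim: Callable[[str, str], Optional[float]],
    ir_sim: Callable[[str, str], float],
) -> float:
    """Use the ontology-based score when it exists, else fall back to the IR score.

    Assumption: the ontology scorer returns None when it cannot relate the terms.
    """
    score = ontology_sim(a, b)
    return score if score is not None else ir_sim(a, b)


def name_tfidf(names: Sequence[Sequence[str]]) -> list[float]:
    """Sum of TF-IDF weights over the tokens of each service name.

    Each tokenized service name is treated as one document (an assumption).
    """
    n = len(names)
    df: dict[str, int] = {}
    for tokens in names:
        for t in set(tokens):
            df[t] = df.get(t, 0) + 1
    scores = []
    for tokens in names:
        total = 0.0
        for t in set(tokens):
            tf = tokens.count(t) / len(tokens)
            total += tf * math.log(n / df[t])
        scores.append(total)
    return scores


def pick_center(
    sim: Sequence[Sequence[float]],
    tfidf: Sequence[float],
    alpha: float = 0.5,
) -> int:
    """Return the index of the service with the best combined score.

    Combined score = alpha * mean similarity to the other services
                     + (1 - alpha) * TF-IDF of the service name.
    The weighted sum and alpha are hypothetical choices.
    """
    n = len(sim)

    def score(i: int) -> float:
        others = [sim[i][j] for j in range(n) if j != i]
        mean_sim = sum(others) / len(others) if others else 0.0
        return alpha * mean_sim + (1 - alpha) * tfidf[i]

    return max(range(n), key=score)
```

In use, `sim` would be the pairwise service-similarity matrix for one cluster, built with `hybrid_similarity`, and `tfidf` would come from `name_tfidf` applied to the tokenized names of the same services.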
