Journal: Journal of Computational and Theoretical Nanoscience

Multi Level Web Data Extraction Based Topical Visual Structure Clustering for Efficient Web Search



Abstract

The problem of web clustering has been approached with various strategies; however, existing methods perform poorly because of the weak data extraction and clustering approaches they use. Most methods rely only on the textual features of a web document and ignore its other features. To improve the performance of web data extraction and clustering in support of web search, this paper presents an efficient multi-level web data extraction technique. The web document is preprocessed to obtain textual, structural, and visual features. The extracted features are then used to perform Topical-Visual-Structure Clustering. Each class of the cluster is organized into three different subclasses. The method first computes a topical similarity measure to identify the class of the document. The visual similarity and structural similarity measures are then used to identify the next-level subclass within each cluster. The method aims to improve web search performance, and from the input query it identifies the type of result the user expects. The proposed method increases the performance of web data extraction and web search.
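The two-level assignment described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the similarity measures, so everything concrete here is an assumption made for illustration: cosine similarity over term counts stands in for the topical measure, Jaccard overlap over hypothetical visual labels and HTML tag names stands in for the visual and structural measures, and the equal 0.5 weights, the dictionary layout, and the function names are invented. It is not the authors' implementation.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-count mappings (assumed topical measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard overlap between two collections (assumed visual/structural measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def assign(doc, classes):
    """Return (class, subclass) for doc using a two-level decision.

    Level 1 picks the class by topical similarity alone.
    Level 2 picks a subclass of that class by an equally weighted
    mix of visual and structural similarity (weights are an assumption).
    """
    terms = Counter(doc["terms"])
    cls = max(classes, key=lambda c: cosine(terms, Counter(classes[c]["terms"])))

    subclasses = classes[cls]["subclasses"]

    def score(name):
        sub = subclasses[name]
        return 0.5 * jaccard(doc["visual"], sub["visual"]) + 0.5 * jaccard(doc["tags"], sub["tags"])

    return cls, max(subclasses, key=score)
```

In this sketch a document is a dictionary with `terms`, `visual`, and `tags` lists, and each class carries a term profile plus named subclasses with their own `visual` and `tags` profiles. Running the topical step first means the visual and structural comparisons are made only against the subclasses of the chosen class.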

