METHOD AND SYSTEM FOR CLASSIFICATION OF VENUE BY ANALYZING DATA FROM VENUE WEBSITE

Abstract

A method and system classify a venue by analyzing venue data from a venue website. The method includes receiving preliminary venue-related data. The method includes scanning the venue website to retrieve venue data, wherein scanning the venue website includes retrieving the venue data from HTML pages, text documents, PDF documents, and images. The method includes retrieving verifiable venue data from the venue data; the verifiable venue data is a subset of the venue data. The method includes analyzing the verifiable venue data by comparing it to the preliminary venue-related data and determining a probability level for the venue URL from the comparison. If the probability level for the venue URL is equal to or greater than a first probability level, the venue website data is further analyzed to extract attributes and attribute counts in a robust and context-sensitive way. The method includes determining the percentage of attribute representation out of the total number of preselected attributes in the venue data, and classifying the venue based on that percentage.
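
The flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not say how the probability level is computed, what the first probability level is, or how the percentage maps to a class. Here the probability is assumed to be the fraction of preliminary fields that match the verifiable data, the threshold is an arbitrary 0.5, and the venue is assigned to the category whose preselected attributes are best represented. All function names, field names, and sample data are hypothetical.

```python
import re
from typing import Dict, List, Optional


def normalize(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumerics to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def url_probability(preliminary: Dict[str, str], verifiable: Dict[str, str]) -> float:
    """Assumed scoring: share of preliminary fields confirmed by the website."""
    if not preliminary:
        return 0.0
    matches = sum(
        1
        for field, value in preliminary.items()
        if normalize(verifiable.get(field, "")) == normalize(value)
    )
    return matches / len(preliminary)


def attribute_counts(site_text: str, attributes: List[str]) -> Dict[str, int]:
    """Count whole-word occurrences of each preselected attribute."""
    text = normalize(site_text)
    return {
        attr: len(re.findall(rf"\b{re.escape(normalize(attr))}\b", text))
        for attr in attributes
    }


def attribute_percentage(counts: Dict[str, int]) -> float:
    """Percentage of preselected attributes that occur at least once."""
    if not counts:
        return 0.0
    present = sum(1 for count in counts.values() if count > 0)
    return 100.0 * present / len(counts)


def classify_venue(
    preliminary: Dict[str, str],
    verifiable: Dict[str, str],
    site_text: str,
    categories: Dict[str, List[str]],
    first_probability_level: float = 0.5,  # assumed value, not from the source
) -> Optional[str]:
    """Return the best-represented category, or None if the URL is not trusted."""
    if url_probability(preliminary, verifiable) < first_probability_level:
        return None
    best_name, best_pct = None, 0.0
    for name, attributes in categories.items():
        pct = attribute_percentage(attribute_counts(site_text, attributes))
        if pct > best_pct:
            best_name, best_pct = name, pct
    return best_name


# Hypothetical sample data for illustration only.
preliminary = {"name": "Example Jazz Bar", "phone": "555-0100"}
verifiable = {"name": "example jazz bar", "phone": "555 0100"}
site_text = "Live jazz every night. Cocktails and a full bar. Jazz trio on Fridays."
categories = {
    "music_venue": ["jazz", "live", "concert"],
    "restaurant": ["menu", "dinner", "reservations"],
}
result = classify_venue(preliminary, verifiable, site_text, categories)
```

The sketch starts from already-extracted text. Retrieving that text from HTML pages, PDF documents, and images, and the "context-sensitive" part of attribute extraction, are left out because the abstract gives no detail on them.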
