System and method for identifying structured data items lacking requisite information for rule-based duplicate detection
Abstract
Embodiments of a system and method for identifying structured data items lacking requisite information for rule-based duplicate detection are described. Embodiments may include generating a deficiency score for each of multiple structured data items including applying a set of rules based on duplicate detection techniques to each given structured data item in order to perform a comparison of the given structured data item to itself. The deficiency score of the given structured data item may be based on a result of the comparison. Embodiments may also include, based on the deficiency scores of the structured data items, identifying one or more deficient structured data items having less than a requisite quantity of information for performing duplicate detection on structured data items. Embodiments may also include identifying one or more key attributes missing from some of the one or more deficient structured data items and requesting those key attributes.
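The self-comparison idea can be illustrated with a minimal Python sketch. The abstract does not give a scoring formula, rule set, or threshold, so everything here is an assumption made for illustration: the attribute names (`title`, `brand`, `gtin`, `model_number`), the rule definitions, the score (the fraction of rules that fail to fire when an item is compared to itself), and the `0.5` cutoff. The intuition is that a rule requiring an attribute on both sides cannot fire on a self-comparison if the item lacks that attribute, so an item that does not even match itself carries too little information for duplicate detection.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Item = Dict[str, Optional[str]]
Rule = Callable[[Item, Item], bool]


def attribute_rule(*attrs: str) -> Rule:
    """Build a rule that fires only when every named attribute is present
    in both items and the normalized values are equal."""
    def rule(a: Item, b: Item) -> bool:
        for attr in attrs:
            left = (a.get(attr) or "").strip().lower()
            right = (b.get(attr) or "").strip().lower()
            if not left or not right or left != right:
                return False
        return True
    return rule


# Hypothetical duplicate-detection rules: each tuple lists the attributes
# that must all match for two items to be considered duplicates.
RULE_ATTRS: List[Tuple[str, ...]] = [
    ("title", "brand"),
    ("gtin",),
    ("title", "model_number"),
]
RULES: List[Rule] = [attribute_rule(*attrs) for attrs in RULE_ATTRS]


def deficiency_score(item: Item, rules: Sequence[Rule] = RULES) -> float:
    """Compare the item to itself; return the fraction of rules that fail.
    0.0 means every rule fired, 1.0 means none did (assumed formula)."""
    fired = sum(1 for rule in rules if rule(item, item))
    return 1.0 - fired / len(rules)


def find_deficient(items: Sequence[Item], threshold: float = 0.5) -> List[Item]:
    """Return items whose score exceeds the (assumed) threshold."""
    return [item for item in items if deficiency_score(item) > threshold]


def missing_key_attributes(item: Item) -> List[str]:
    """List attributes used by any rule that the item lacks."""
    needed = {attr for attrs in RULE_ATTRS for attr in attrs}
    return sorted(a for a in needed if not (item.get(a) or "").strip())


if __name__ == "__main__":
    complete = {"title": "Widget", "brand": "Acme",
                "gtin": "0001", "model_number": "W-1"}
    sparse = {"title": "Widget", "brand": None}
    for item in find_deficient([complete, sparse]):
        # These are the attributes one would request from the data provider.
        print(item, "is missing", missing_key_attributes(item))
```

In this sketch the list returned by `missing_key_attributes` plays the role of the key attributes that would be requested for a deficient item; a real system might weight rules or attributes differently.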