Systems and techniques to monitor text data quality
Abstract
Disclosed are a system, an apparatus, and techniques for evaluating a dataset to confirm that the data in the dataset satisfies a data quality metric. A machine learning engine or the like may evaluate text strings within the dataset; the text strings may be of arbitrary length and encoded according to an encoding standard. Data vectors of a preset length may be generated from the evaluated text strings using various techniques. Each data vector may be representative of the content of its text string, and a category may be assigned to the respective data vector. The category assigned to each data vector may be evaluated with respect to other data vectors in the dataset to determine compliance with a quality metric. In the case that a number of data vectors fail to meet a predetermined quality metric, an alert may be generated to mitigate any system errors that may result from unsatisfactory data quality.
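The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the vectorization technique, the categorization method, or the quality metric, so everything below is an assumption made for illustration: byte-trigram feature hashing stands in for "various techniques", a nearest-centroid rule stands in for the machine learning engine, and the metric is taken to be the fraction of vectors whose category differs from an expected one. The names `text_to_vector`, `assign_category`, `check_quality`, `VECTOR_LENGTH`, and `max_fail_fraction` are hypothetical.

```python
import hashlib

VECTOR_LENGTH = 16  # assumed preset vector length; not specified in the abstract


def text_to_vector(text, length=VECTOR_LENGTH):
    """Map a string of arbitrary length to a fixed-length count vector.

    Assumption: UTF-8 is the encoding standard and byte trigrams are hashed
    into buckets (feature hashing). The abstract only says "various techniques".
    """
    data = text.encode("utf-8")
    vec = [0] * length
    if not data:
        return vec  # an empty string yields the all-zero vector
    for i in range(max(len(data) - 2, 1)):
        gram = data[i:i + 3]
        digest = hashlib.md5(gram).digest()
        bucket = int.from_bytes(digest[:4], "big") % length
        vec[bucket] += 1
    return vec


def assign_category(vec, centroids):
    """Return the label of the nearest centroid (squared Euclidean distance).

    Assumption: a nearest-centroid rule stands in for the machine learning engine.
    """
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(vec, centroids[label]))
    return min(centroids, key=dist)


def check_quality(texts, centroids, expected, max_fail_fraction=0.1):
    """Flag the dataset when too many vectors fall outside the expected category.

    Assumption: the quality metric is the fraction of mismatched categories.
    """
    categories = [assign_category(text_to_vector(t), centroids) for t in texts]
    failures = sum(1 for c in categories if c != expected)
    fraction = failures / len(texts) if texts else 0.0
    return {
        "failures": failures,
        "fraction": fraction,
        "alert": fraction > max_fail_fraction,  # True means an alert should be raised
    }


if __name__ == "__main__":
    # Hypothetical reference examples defining two categories.
    centroids = {
        "date": text_to_vector("2021-03-04"),
        "word": text_to_vector("hello world"),
    }
    column = ["2021-03-04", "2021-03-04", "hello world"]
    print(check_quality(column, centroids, expected="date"))
```

In this sketch the alert is a boolean in the returned dictionary; a real system would presumably route it to whatever monitoring channel guards the downstream consumers of the dataset.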