SIAMESE NEURAL NETWORKS FOR FLAGGING TRAINING DATA IN TEXT-BASED MACHINE LEARNING
Abstract
Techniques performed by a data processing system for analyzing training data for a machine learning model and identifying outliers in the training data include: obtaining training data for the model from a memory of the data processing system; analyzing the training data using a Siamese Neural Network to determine within-label similarities and cross-label similarities associated with a plurality of data elements within the training data, the within-label similarities representing similarities between a respective data element and a first set of data elements similarly labeled in the training data, and the cross-label similarities representing similarities between the respective data element and a second set of data elements dissimilarly labeled in the training data; identifying outlier data elements in the plurality of data elements based on the within-label and cross-label similarities; and processing the training data comprising the outlier data elements. Processing may include deleting the outlier data elements or generating a report.
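The within-label and cross-label comparison described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how similarities are scored or thresholded. The sketch assumes each data element has already been mapped to an embedding vector by one branch of a trained Siamese network, and it uses cosine similarity, mean aggregation, and a `margin` rule as illustrative choices. The function names, the toy embeddings, and the labels are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def label_similarities(embeddings, labels):
    """Return (within, cross) mean similarities for each element.

    within[i]: mean similarity to the other elements sharing element i's label.
    cross[i]:  mean similarity to the elements carrying a different label.
    """
    n = len(embeddings)
    within, cross = [], []
    for i in range(n):
        same = [cosine(embeddings[i], embeddings[j])
                for j in range(n) if j != i and labels[j] == labels[i]]
        diff = [cosine(embeddings[i], embeddings[j])
                for j in range(n) if labels[j] != labels[i]]
        within.append(sum(same) / len(same) if same else 0.0)
        cross.append(sum(diff) / len(diff) if diff else 0.0)
    return within, cross

def flag_outliers(embeddings, labels, margin=0.0):
    """Assumed rule: flag element i when its within-label similarity
    does not exceed its cross-label similarity by more than `margin`."""
    within, cross = label_similarities(embeddings, labels)
    return [i for i in range(len(labels)) if within[i] - cross[i] <= margin]

# Toy 2-D vectors standing in for Siamese encoder outputs (hypothetical data).
# The last element is labeled "billing" but sits in the "greeting" cluster.
embeddings = [(1.0, 0.1), (0.9, 0.0), (1.0, -0.1),
              (0.0, 1.0), (0.1, 0.9), (0.95, 0.05)]
labels = ["greeting", "greeting", "greeting", "billing", "billing", "billing"]

outliers = flag_outliers(embeddings, labels)

# One possible "processing" step: drop the flagged elements from the training set.
cleaned = [(e, l) for i, (e, l) in enumerate(zip(embeddings, labels))
           if i not in outliers]
```

In this toy data only the mislabeled last element is flagged, because it resembles the "greeting" examples more than its own "billing" peers. A report-generating variant could return the flagged indices with their two scores instead of deleting anything.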