SYSTEMS AND METHODS FOR A SCALABLE CONTINUOUS ACTIVE LEARNING APPROACH TO INFORMATION CLASSIFICATION
Abstract
Systems and methods for classifying electronic information are provided by way of a Technology-Assisted Review (“TAR”) process. In certain embodiments, the TAR process is a Scalable Continuous Active Learning (“S-CAL”) approach. In certain embodiments, S-CAL selects an initial sample from a document collection, trains a classifier using a default classification for a portion of the initial sample, scores the initial sample, selects a sub-sample from the initial sample for review, and removes the reviewed sub-sample from the initial sample; it then re-trains the classifier and repeats the process until the initial sample is exhausted. In certain embodiments, a classification threshold is determined from a calculated estimate of the prevalence of relevant information, such that the threshold classifies the information in accordance with a determined target criterion. In certain embodiments, the estimate of prevalence is determined from the results of iterations of a TAR process such as S-CAL.
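The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything beyond the steps named in the abstract is an assumption made for illustration: the toy centroid-difference classifier in `train`, the single relevant `seed` document, the fixed `batch_size`, the choice to presume a random portion of the remaining sample non-relevant as the "default classification", and the rule in `threshold_for_recall`, which treats the target criterion as a recall level. All function and parameter names are hypothetical.

```python
import math
import random


def train(labeled):
    """Toy linear model (assumption): mean(relevant) - mean(non-relevant)."""
    dim = len(labeled[0][0])

    def mean(rows):
        return [sum(r[i] for r in rows) / len(rows) if rows else 0.0
                for i in range(dim)]

    pos = mean([x for x, y in labeled if y])
    neg = mean([x for x, y in labeled if not y])
    return [p - n for p, n in zip(pos, neg)]


def score(weights, x):
    """Dot product of the weight vector and a document vector."""
    return sum(w * v for w, v in zip(weights, x))


def s_cal(collection, seed, review, sample_size, batch_size, rng):
    """Exhaust a random sample; return (labels, prevalence estimate)."""
    # Select the initial sample from the document collection.
    sample = set(rng.sample(range(len(collection)), sample_size))
    labels = {}
    while sample:
        # Default classification: presume a portion of the sample non-relevant.
        presumed = rng.sample(sorted(sample), min(len(sample), batch_size))
        training = [(seed, True)]
        training += [(collection[i], y) for i, y in labels.items()]
        training += [(collection[i], False) for i in presumed]
        weights = train(training)
        # Score the remaining sample and pick the top-scoring sub-sample.
        ranked = sorted(sample, key=lambda i: score(weights, collection[i]),
                        reverse=True)
        batch = ranked[:batch_size]
        for i in batch:
            labels[i] = review(i)  # human (or simulated) relevance judgment
        # Remove the reviewed sub-sample, then loop to re-train.
        sample.difference_update(batch)
    # The sample was drawn uniformly, so its relevant fraction estimates prevalence.
    prevalence = sum(labels.values()) / sample_size
    return labels, prevalence


def threshold_for_recall(scored, prevalence, target_recall):
    """Lowest score cutoff expected to reach target_recall (assumed rule).

    scored: (score, is_relevant) pairs for reviewed documents.
    """
    relevant_scores = sorted((s for s, rel in scored if rel), reverse=True)
    if not relevant_scores:
        return float("inf")
    estimated_relevant = prevalence * len(scored)
    needed = max(1, math.ceil(target_recall * estimated_relevant - 1e-9))
    return relevant_scores[min(needed, len(relevant_scores)) - 1]


# Simulated run: 100 one-dimensional "documents", relevant when x >= 0.8.
rng = random.Random(0)
collection = [(i / 100.0,) for i in range(100)]
truth = [x[0] >= 0.8 for x in collection]
labels, prevalence = s_cal(collection, (1.0,), lambda i: truth[i], 100, 10, rng)
```

In this simulated run the sample is the whole collection, so the prevalence estimate equals the true fraction of relevant documents. With a smaller `sample_size` the same ratio is only an estimate, and a production system would likely grow the batch size and review only part of each batch, which this sketch does not attempt.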