Ground Truth Generation for Machine Learning Based Quality Assessment of Corpora
Abstract
A mechanism is provided in a computing device configured with instructions executing on a processor of the computing device to implement a ground truth generation system for quality assessment scoring of articles in a corpus. The ground truth generation system receives recommendations of a set of recommended articles from subject matter experts. The ground truth generation system identifies a set of non-recommended articles. A topic clustering component within the ground truth generation system performs topic clustering on a combination of the set of recommended articles and the set of non-recommended articles to form a set of topic clusters containing recommended articles and non-recommended articles. The ground truth generation system identifies a first number of recommended articles and a second number of non-recommended articles in each of the set of topic clusters to form a quality assessment training set. The mechanism trains a quality assessment machine learning model using the quality assessment training set.
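The per-cluster selection step described in the abstract can be illustrated with a minimal Python sketch. It assumes that topic clustering has already assigned each article a cluster ID, and that articles arrive as `(article_id, cluster_id, is_recommended)` tuples. The function name `build_training_set`, the tuple layout, the random sampling, and the fallback of taking every available article when a cluster holds fewer than the requested number are all illustrative assumptions, not details taken from the patent.

```python
import random
from collections import defaultdict


def build_training_set(articles, n_recommended, n_non_recommended, seed=0):
    """Pick up to n_recommended / n_non_recommended articles per topic cluster.

    articles: iterable of (article_id, cluster_id, is_recommended) tuples
              (hypothetical layout; cluster IDs come from any clustering step).
    Returns a list of (article_id, cluster_id, label), label 1 = recommended.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible

    # Group article IDs by cluster, split into recommended / non-recommended.
    by_cluster = defaultdict(lambda: {True: [], False: []})
    for article_id, cluster_id, is_recommended in articles:
        by_cluster[cluster_id][bool(is_recommended)].append(article_id)

    training_set = []
    for cluster_id in sorted(by_cluster):
        groups = by_cluster[cluster_id]
        for label, quota in ((True, n_recommended), (False, n_non_recommended)):
            pool = groups[label]
            # Assumption: if a cluster has fewer articles than the quota,
            # take all of them rather than failing.
            chosen = rng.sample(pool, min(quota, len(pool)))
            training_set.extend((aid, cluster_id, int(label)) for aid in chosen)
    return training_set


# Toy input: two topic clusters with expert-recommended and other articles.
articles = [
    ("a1", 0, True), ("a2", 0, True),
    ("a3", 0, False), ("a4", 0, False), ("a5", 0, False),
    ("b1", 1, True), ("b2", 1, False),
]
training = build_training_set(articles, n_recommended=1, n_non_recommended=2)
```

The resulting labelled list would then be passed to whatever quality assessment model is being trained. Sampling within each cluster, rather than across the whole corpus, keeps every topic represented by both recommended and non-recommended examples.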