Training corpus generation methods, devices, equipment and storage media
Abstract
PROBLEM TO BE SOLVED: To effectively improve speech recognition accuracy, significantly shorten the iteration cycle of a speech recognition model, and save a large amount of resources. A method of generating a training corpus mines a plurality of labeled corpus data items from the user behavior logs associated with a target application program. Each corpus data item includes a first behavior log, which contains the user's voice and the corresponding speech recognition result, and a second behavior log, which is temporally associated with the first behavior log and belongs to the same user. Based on the association between the first behavior log and the second behavior log in each corpus data item, the user's voice and the corresponding speech recognition result are judged to be either a positive-feedback or a negative-feedback training corpus. In this way, positive-feedback and negative-feedback training corpora for speech recognition are mined automatically and in a targeted manner from user behavior, and are provided for the training of subsequent speech recognition models. [Selection diagram] Fig. 1
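The mining and judging steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not state how the association between the two logs is evaluated, so everything concrete here is an assumption made for illustration: the `BehaviorLog` fields, the action names, the 30-second pairing window, and the rule that certain follow-up actions imply positive or negative feedback.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class BehaviorLog:
    """One entry of a user behavior log (all field names are hypothetical)."""
    user_id: str
    timestamp: float                        # seconds since some epoch
    action: str                             # e.g. "voice_query"
    recognized_text: Optional[str] = None   # set only on voice logs
    audio_ref: Optional[str] = None         # pointer to the user's voice


# Assumed action vocabularies; the abstract does not define them.
POSITIVE_FOLLOW_UPS = {"confirm_result", "start_navigation"}
NEGATIVE_FOLLOW_UPS = {"edit_text", "cancel"}


def mine_pairs(
    logs: Iterable[BehaviorLog], window_s: float = 30.0
) -> List[Tuple[BehaviorLog, BehaviorLog]]:
    """Pair each voice log with the same user's next log within window_s."""
    ordered = sorted(logs, key=lambda log: (log.user_id, log.timestamp))
    pairs = []
    for first, second in zip(ordered, ordered[1:]):
        if (
            first.action == "voice_query"
            and first.user_id == second.user_id
            and second.timestamp - first.timestamp <= window_s
        ):
            pairs.append((first, second))
    return pairs


def judge(second: BehaviorLog) -> Optional[str]:
    """Assumed rule: the follow-up action decides the feedback label."""
    if second.action in POSITIVE_FOLLOW_UPS:
        return "positive"
    if second.action in NEGATIVE_FOLLOW_UPS:
        return "negative"
    return None  # ambiguous follow-up: leave the pair out of the corpus


def build_corpus(
    logs: Iterable[BehaviorLog], window_s: float = 30.0
) -> Dict[str, List[Tuple[Optional[str], Optional[str]]]]:
    """Collect (audio_ref, recognized_text) pairs per feedback label."""
    corpus: Dict[str, List[Tuple[Optional[str], Optional[str]]]] = {
        "positive": [],
        "negative": [],
    }
    for first, second in mine_pairs(logs, window_s):
        label = judge(second)
        if label is not None:
            corpus[label].append((first.audio_ref, first.recognized_text))
    return corpus
```

Under these assumptions, a voice query followed shortly by `start_navigation` would land in the positive set, one followed by `edit_text` would land in the negative set, and a voice query with no follow-up inside the window would be skipped.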