SYSTEMS AND METHODS FOR MACHINE LEARNING-BASED MULTI-INTENT SEGMENTATION AND CLASSIFICATION

Abstract

Systems and methods for synthesizing training data for multi-intent utterance segmentation include identifying a first corpus of utterances comprising a plurality of distinct single-intent in-domain utterances; identifying a second corpus of utterances comprising a plurality of distinct single-intent out-of-domain utterances; identifying a third corpus comprising a plurality of distinct conjunction terms; forming a multi-intent training corpus comprising synthetic multi-intent utterances, wherein forming each distinct multi-intent utterance includes: selecting a first distinct in-domain utterance from the first corpus of utterances; probabilistically selecting one of a first out-of-domain utterance from the second corpus and a second in-domain utterance from the first corpus; probabilistically selecting or not selecting a distinct conjunction term from the third corpus; and forming a synthetic multi-intent utterance including appending the first in-domain utterance with one of the first out-of-domain utterance from the second corpus of utterances and the second in-domain utterance from the first corpus of utterances.
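
The synthesis procedure described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `synthesize_multi_intent`, the probabilities `p_out_of_domain` and `p_conjunction`, and the returned dictionary layout are all hypothetical. The sketch also assumes that the second in-domain utterance must differ from the first and that a selected conjunction is placed between the two utterances; the abstract states neither detail.

```python
import random


def synthesize_multi_intent(in_domain, out_of_domain, conjunctions,
                            p_out_of_domain=0.3, p_conjunction=0.5, rng=None):
    """Form one synthetic multi-intent utterance from single-intent corpora.

    The probabilities are illustrative assumptions, not values from the source.
    """
    if rng is None:
        rng = random.Random()
    if len(in_domain) < 2:
        raise ValueError("need at least two distinct in-domain utterances")

    # Step 1: select a first in-domain utterance from the first corpus.
    first = rng.choice(in_domain)

    # Step 2: probabilistically select either an out-of-domain utterance
    # (second corpus) or a second in-domain utterance (first corpus).
    if out_of_domain and rng.random() < p_out_of_domain:
        second, second_label = rng.choice(out_of_domain), "out_of_domain"
    else:
        # Assumption: the second in-domain utterance differs from the first.
        candidates = [u for u in in_domain if u != first]
        if not candidates:
            raise ValueError("in-domain corpus has no second distinct utterance")
        second, second_label = rng.choice(candidates), "in_domain"

    # Step 3: probabilistically select, or do not select, a conjunction term
    # from the third corpus.
    conjunction = None
    if conjunctions and rng.random() < p_conjunction:
        conjunction = rng.choice(conjunctions)

    # Step 4: append the second utterance to the first.
    # Assumption: the conjunction, when selected, sits between the two.
    parts = [first, second] if conjunction is None else [first, conjunction, second]
    return {
        "text": " ".join(parts),
        "first": first,
        "second": second,
        "second_label": second_label,
        "conjunction": conjunction,
    }


def build_training_corpus(in_domain, out_of_domain, conjunctions, size, seed=0):
    """Form a multi-intent training corpus of `size` synthetic utterances."""
    rng = random.Random(seed)
    return [
        synthesize_multi_intent(in_domain, out_of_domain, conjunctions, rng=rng)
        for _ in range(size)
    ]
```

Calling `build_training_corpus(["check my balance", "transfer money to savings"], ["tell me a joke"], ["and", "then"], size=100)` would yield 100 records. Each record keeps the two source utterances alongside the joined text, so a segmentation label (the boundary between the first and second utterance) could be derived from it. Omitting the conjunction in a fraction of samples presumably lets a downstream segmenter see utterances joined without an explicit connective.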
