A computer-implemented method for dialog state tracking employs first and second latent variable models that have been learned by reconstructing a decompositional model generated from annotated training dialogs. The decompositional model includes, for each of a plurality of dialog state transitions, each corresponding to a respective turn of one of the training dialogs, state descriptors for the initial and final states of the transition and a respective representation of the dialog for that turn. The first latent variable model includes embeddings of the plurality of state transitions, and the second latent variable model includes embeddings of features of the state descriptors and embeddings of features of the dialog representations. Data for a new dialog state transition is received, including a state descriptor for the initial state and a respective dialog representation. A state descriptor for the final state of the new dialog state transition is predicted using the learned latent variable models.
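The method summarized above can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes the decompositional model is a matrix `M` with one row per transition and columns laid out as [initial-state features | final-state features | dialog features], and it swaps in a truncated SVD as a simple stand-in for whatever learning procedure is actually used to reconstruct that matrix. The function names `learn_latent_models` and `predict_final_state`, the one-hot toy data, and the rank are all hypothetical choices made for illustration.

```python
import numpy as np

def learn_latent_models(M, rank):
    """Factorize M ~= A @ B.T (assumption: truncated SVD as the learner).

    A: embeddings of the state transitions (first latent variable model).
    B: embeddings of state-descriptor and dialog features (second model).
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = U[:, :rank] * s[:rank]
    B = Vt[:rank].T
    return A, B

def predict_final_state(B, initial_state, dialog_repr, n_state):
    """Score final-state features for a new transition.

    The new transition's embedding is fitted by least squares on the
    known columns (initial state and dialog features), then projected
    onto the final-state feature embeddings.
    """
    known_cols = np.r_[0:n_state, 2 * n_state:B.shape[0]]
    x = np.concatenate([initial_state, dialog_repr])
    a, *_ = np.linalg.lstsq(B[known_cols], x, rcond=None)
    return B[n_state:2 * n_state] @ a

# Toy data (hypothetical): 3 state values, 3 dialog words; the word
# uttered in a turn determines the final state.
n_state = 3
eye = np.eye(n_state)
rows = []
for i in range(n_state):
    for j in range(n_state):
        if (i, j) == (2, 1):
            continue  # hold this transition out of training
        rows.append(np.concatenate([eye[i], eye[j], eye[j]]))
M = np.array(rows)

A, B = learn_latent_models(M, rank=5)

# New transition: initial state 2, dialog feature 1, final state unknown.
scores = predict_final_state(B, eye[2], eye[1], n_state)
predicted_final = int(np.argmax(scores))
print(predicted_final, np.round(scores, 3))
```

In this sketch the transition embeddings in `A` are only needed during learning. Prediction for a new turn uses the feature embeddings in `B` to infer that turn's embedding from the observed initial state and dialog representation.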