A computer-implemented method for dialog state tracking employs first and second latent variable models that have been learned by reconstructing a decompositional model generated from annotated training dialogs. The decompositional model includes, for each of a plurality of dialog state transitions, each corresponding to a respective turn of one of the training dialogs, state descriptors for the initial and final states of the transition and a representation of the dialog for that turn. The first latent variable model includes embeddings of the plurality of state transitions, and the second latent variable model includes embeddings of features of the state descriptors and embeddings of features of the dialog representations. Data for a new dialog state transition is received, including a state descriptor for the initial state and a respective dialog representation. A state descriptor for the final state of the new dialog state transition is then predicted using the learned latent variable models.
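The method summarized above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the single state variable and its values (`STATES`), the bag-of-words vocabulary (`WORDS`), the toy training transitions, and the helper names `encode`, `factorize` and `predict_final`. The abstract does not say how the reconstruction is learned, so the sketch substitutes a plain truncated SVD for the learning step and a ridge-regularized least-squares fit for inferring the embedding of a new transition.

```python
import numpy as np

STATES = ["none", "italian", "chinese"]          # hypothetical values of one state variable
WORDS = ["italian", "chinese", "want", "hello"]  # hypothetical bag-of-words vocabulary
N_KNOWN = len(STATES) + len(WORDS)               # columns observed for a new transition


def encode(initial, words, final=None):
    """One row of the decompositional matrix: [initial state | dialog words | final state]."""
    row = np.zeros(N_KNOWN + len(STATES))
    row[STATES.index(initial)] = 1.0
    for w in words:
        row[len(STATES) + WORDS.index(w)] = 1.0
    if final is not None:
        row[N_KNOWN + STATES.index(final)] = 1.0
    return row


# Toy annotated transitions (assumed data): (initial state, words in the turn, final state).
TRAIN = [
    ("none", ["want", "italian"], "italian"),
    ("none", ["want", "chinese"], "chinese"),
    ("none", ["hello"], "none"),
    ("italian", ["want", "chinese"], "chinese"),
    ("chinese", ["want", "italian"], "italian"),
    ("italian", ["hello"], "italian"),
    ("chinese", ["hello"], "chinese"),
]
M = np.array([encode(*t) for t in TRAIN])


def factorize(M, rank):
    """Reconstruct M as A @ B.T using a truncated SVD (a stand-in for the learning step).

    A holds one embedding per training transition (first latent variable model);
    B holds one embedding per state-descriptor or dialog feature (second model).
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = U[:, :rank]
    B = Vt[:rank].T * s[:rank]
    return A, B


def predict_final(B, initial, words, ridge=1e-3):
    """Infer the new transition's embedding from its known columns, then score final states."""
    x_known = encode(initial, words)[:N_KNOWN]
    B_known, B_final = B[:N_KNOWN], B[N_KNOWN:]
    gram = B_known.T @ B_known + ridge * np.eye(B.shape[1])
    a = np.linalg.solve(gram, B_known.T @ x_known)  # embedding of the new transition
    scores = B_final @ a                            # one score per candidate final state
    return STATES[int(np.argmax(scores))], scores


A, B = factorize(M, rank=np.linalg.matrix_rank(M))
predicted, scores = predict_final(B, "none", ["want", "chinese"])
print(predicted, np.round(scores, 2))
```

The rank is set to the full rank of the toy matrix so that the reconstruction is exact; a practical system would presumably choose a much smaller rank and a learning procedure suited to sparse, large-scale data.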