Disclosed are techniques for generating a language model for interpreting commands that invoke application-based actions via a digital assistant device. In various embodiments, command templates, each mapped to one of a plurality of action datasets, are obtained and used to generate synthetic documents for a language model document corpus. Each synthetic document can be modified to include a tag that corresponds to the command template from which the document was generated. The language model can include a plurality of document clusters that are generated based on, among other things, the modified synthetic documents. In this way, when a command is received from a digital assistant device, a relevant set of modified synthetic documents can be identified from the generated document clusters, and the most relevant modified synthetic document in that set can be determined. The action dataset mapped to the defined command that is determined to correspond to the tag included in that most relevant document can then be selected.
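The flow described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the template strings, slot values, action dataset names, and the helper functions (`build_corpus`, `build_clusters`, `interpret`) are all hypothetical, the clusters are approximated by grouping documents that share a tag, and relevance is approximated by simple token overlap (Jaccard similarity) rather than by any particular language model.

```python
import re
from collections import defaultdict

# Hypothetical command templates, each mapped to a hypothetical action dataset.
TEMPLATES = {
    "play {song} on music app": "action_play_music",
    "order a ride to {place}": "action_book_ride",
}
# Hypothetical slot values used to expand templates into synthetic documents.
SLOT_VALUES = {"song": ["jazz", "rock"], "place": ["airport", "home"]}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(a, b):
    """Token-set overlap, used here as a stand-in relevance score."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def build_corpus(templates, slot_values):
    """Generate synthetic documents and tag each with its source template."""
    corpus, tag_to_template = [], {}
    for index, template in enumerate(templates):
        tag = f"tpl-{index}"
        tag_to_template[tag] = template
        slot = re.search(r"\{(\w+)\}", template).group(1)
        for value in slot_values[slot]:
            text = template.replace("{" + slot + "}", value)
            corpus.append({"text": text, "tag": tag})
    return corpus, tag_to_template


def build_clusters(corpus):
    """Assumption: approximate document clusters by grouping on the tag."""
    clusters = defaultdict(list)
    for doc in corpus:
        clusters[doc["tag"]].append(doc)
    return dict(clusters)


def interpret(command, clusters, tag_to_template, templates):
    """Select the action dataset for a received command."""
    tokens = tokenize(command)
    # Relevant set: the cluster whose combined vocabulary best matches.
    relevant = max(
        clusters.values(),
        key=lambda docs: jaccard(
            tokens, [t for d in docs for t in tokenize(d["text"])]
        ),
    )
    # Most relevant modified synthetic document within that set.
    best_doc = max(relevant, key=lambda d: jaccard(tokens, tokenize(d["text"])))
    # The tag leads back to the template, which maps to the action dataset.
    return templates[tag_to_template[best_doc["tag"]]]


corpus, tag_to_template = build_corpus(TEMPLATES, SLOT_VALUES)
clusters = build_clusters(corpus)
selected = interpret("play some jazz", clusters, tag_to_template, TEMPLATES)
```

In this sketch the tag is the only link between a retrieved document and its action dataset, which mirrors the role the tag plays in the description: the received command never has to match a template exactly, only to land nearest a document generated from it.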