A processor-implemented method of updating a sentence generation model includes: generating a target sentence corresponding to a source sentence using a first decoding model; calculating reward information associated with the target sentence using a second decoding model configured to generate a sentence in an order different from the order in which the first decoding model generates the sentence; and generating an updated sentence generation model by resetting the weights of respective nodes in the first decoding model based on the calculated reward information.
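The three steps of the method can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the claimed implementation: the first decoding model is reduced to one row of logits per output position (conditioning on the source sentence is omitted for brevity), the second decoding model is a fixed bigram table that scores the sentence right to left, and the weight reset is a REINFORCE-style step. The names `ForwardDecoder`, `backward_reward`, `update`, `train`, and `BACKWARD_BIGRAM`, and all numeric values, are hypothetical.

```python
import math
import random

VOCAB = ["a", "b", "c"]

# Assumed stand-in for the second decoding model: BACKWARD_BIGRAM[right][left]
# is the probability of a token given the token to its right.
BACKWARD_BIGRAM = [
    [0.4, 0.3, 0.3],  # right neighbour is "a"
    [0.8, 0.1, 0.1],  # right neighbour is "b": prefers "a" on the left
    [0.1, 0.8, 0.1],  # right neighbour is "c": prefers "b" on the left
]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ForwardDecoder:
    """Toy first decoding model: one row of logits per output position."""

    def __init__(self, length):
        self.logits = [[0.0] * len(VOCAB) for _ in range(length)]

    def generate(self, rng):
        # Step 1: sample the target sentence left to right.
        tokens = []
        for row in self.logits:
            probs = softmax(row)
            tokens.append(rng.choices(range(len(VOCAB)), weights=probs)[0])
        return tokens


def backward_reward(tokens, bigram):
    # Step 2: score the same sentence in the opposite order (right to left)
    # and use the mean token probability as the reward.
    rev = tokens[::-1]
    probs = [bigram[right][left] for right, left in zip(rev, rev[1:])]
    return sum(probs) / len(probs)


def update(decoder, tokens, reward, baseline, lr=0.5):
    # Step 3: adjust the first model's weights in proportion to the reward
    # (policy-gradient step on the log-probability of the sampled tokens).
    advantage = reward - baseline
    for row, tok in zip(decoder.logits, tokens):
        probs = softmax(row)
        for j in range(len(row)):
            grad = (1.0 if j == tok else 0.0) - probs[j]
            row[j] += lr * advantage * grad


def train(steps=200, seed=0):
    rng = random.Random(seed)
    decoder = ForwardDecoder(length=3)
    baseline = 0.0
    for _ in range(steps):
        tokens = decoder.generate(rng)
        reward = backward_reward(tokens, BACKWARD_BIGRAM)
        update(decoder, tokens, reward, baseline)
        baseline = 0.9 * baseline + 0.1 * reward  # running-average baseline
    return decoder
```

Calling `train()` returns a `ForwardDecoder` whose logits have been shifted toward sentences that the reverse-order scorer rates highly. In a real system both decoders would be neural networks conditioned on the source sentence; the sketch only shows how a reward computed in the opposite generation order can drive the weight update.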