A method may include identifying an input sequence. The input sequence or, in some cases, a fixed-length representation of the input sequence may be modified by applying a protein design computation model trained to approximate a distribution of protein sequences exhibiting certain desirable properties. The protein design computation model may include at least one energy-based model and a corresponding energy function. The at least one energy-based model may be applied to modify the input sequence, while the corresponding energy function may be applied to determine the likelihood of the modified input sequence within that distribution. An output sequence may be generated based on the modified input sequence upon determining that this likelihood satisfies one or more thresholds.
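
The modify, score, and threshold loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the source: the toy `energy` function (a count of residues outside a hypothetical `PREFERRED` set) stands in for a trained energy function, single-residue Metropolis moves stand in for whatever modification the energy-based model applies, and the likelihood is treated as proportional to exp(-energy), so a likelihood threshold is expressed here as an energy threshold.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical stand-in for the "desirable property": a toy preference for
# these residues. A real system would use a trained energy function instead.
PREFERRED = set("AILMFVW")


def energy(seq: str) -> float:
    """Toy energy: number of residues outside PREFERRED (lower is better)."""
    return float(sum(1 for aa in seq if aa not in PREFERRED))


def unnormalized_likelihood(seq: str) -> float:
    """Likelihood up to a normalizing constant, assuming p(seq) ~ exp(-E)."""
    return math.exp(-energy(seq))


def propose(seq: str, rng: random.Random) -> str:
    """Modify the sequence by resampling one randomly chosen position."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def design(input_seq, energy_threshold, max_steps=10_000,
           temperature=1.0, seed=0):
    """Metropolis sampling that stops once the energy threshold is met.

    Returns the output sequence, or None if no sequence satisfying the
    threshold was reached within max_steps.
    """
    rng = random.Random(seed)
    current = input_seq
    for _ in range(max_steps):
        # Threshold check: low energy corresponds to high likelihood.
        if energy(current) <= energy_threshold:
            return current
        candidate = propose(current, rng)
        delta = energy(candidate) - energy(current)
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
    return None
```

Calling `design("DEKRDEKR", energy_threshold=2.0)` would start from a sequence with no preferred residues and return a modified sequence once its toy energy is at most 2.0. Using an energy threshold avoids computing the normalizing constant of the distribution, which is generally intractable for energy-based models.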