A method, a computer readable medium, and a computer system are provided for singing voice conversion. Data corresponding to a singing voice is received. One or more features and pitch data are extracted from the received data using one or more adversarial neural networks. One or more audio samples are generated based on the extracted pitch data and the one or more features.
展开▼