This disclosure provides joint automatic speech recognition (ASR) and text-to-speech (TTS) conversion using adversarial neural networks. It describes an end-to-end deep-learning-based system that can solve both ASR and TTS problems jointly using unpaired text and audio samples. An adversarially trained approach is used to produce a more robust, independent neural TTS network and neural ASR network, which can be used individually or simultaneously. Training the neural networks involves generating an audio sample from a text sample using the neural TTS network and then feeding the generated audio sample into the neural ASR network to regenerate the text. The difference between the regenerated text and the original text is used as a first loss to train the neural networks. A similar process is used for an audio sample: the neural ASR network transcribes it, and the neural TTS network regenerates audio from the resulting text. The difference between the regenerated audio and the original audio is used as a second loss. A text critic and an audio critic are applied in a similar way to the outputs of the neural networks to generate additional losses for training.
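The training signal described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the toy "networks" are plain functions, audio is reduced to one number per frame, and the names `training_losses`, `toy_tts`, `toy_asr`, and `toy_critic` are hypothetical. The abstract does not specify how the differences or the critic losses are computed, so the cross-entropy, mean absolute difference, and `-log(score)` critic terms below are assumptions chosen for illustration.

```python
import math
from typing import Callable, Dict, List

Text = List[int]                # token ids
Audio = List[float]             # toy audio: one value per frame
TokenProbs = List[List[float]]  # per-frame distribution over the vocabulary

EPS = 1e-9  # avoids log(0)


def text_difference(probs: TokenProbs, target: Text) -> float:
    """Mean cross-entropy between regenerated and original text (assumed metric)."""
    return -sum(math.log(p[t] + EPS) for p, t in zip(probs, target)) / len(target)


def audio_difference(pred: Audio, target: Audio) -> float:
    """Mean absolute difference between regenerated and original audio (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(pred, target)) / len(target)


def greedy_decode(probs: TokenProbs) -> Text:
    """Pick the most probable token for every frame."""
    return [max(range(len(p)), key=lambda k: p[k]) for p in probs]


def training_losses(
    tts: Callable[[Text], Audio],
    asr: Callable[[Audio], TokenProbs],
    text_critic: Callable[[Text], float],
    audio_critic: Callable[[Audio], float],
    text: Text,
    audio: Audio,
) -> Dict[str, float]:
    """Compute the four loss terms for one unpaired (text, audio) pair."""
    # Text cycle: text -> TTS -> audio -> ASR -> regenerated text.
    generated_audio = tts(text)
    first_loss = text_difference(asr(generated_audio), text)

    # Audio cycle: audio -> ASR -> text -> TTS -> regenerated audio.
    generated_text = greedy_decode(asr(audio))
    second_loss = audio_difference(tts(generated_text), audio)

    # Critics score generated outputs in (0, 1); a higher score means "looks real".
    # Assumption: the generators are penalized with -log(score).
    return {
        "first_loss": first_loss,
        "second_loss": second_loss,
        "audio_critic_loss": -math.log(audio_critic(generated_audio) + EPS),
        "text_critic_loss": -math.log(text_critic(generated_text) + EPS),
    }


# Toy stand-ins for the neural networks (three-token vocabulary).
def toy_tts(text: Text) -> Audio:
    return [float(token) for token in text]


def toy_asr(audio: Audio) -> TokenProbs:
    probs = []
    for frame in audio:
        scores = [math.exp(-10.0 * (frame - k) ** 2) for k in range(3)]
        total = sum(scores)
        probs.append([s / total for s in scores])
    return probs


def toy_critic(_sample) -> float:
    return 0.5  # an undecided critic


# The text and audio samples are unpaired: they do not describe the same utterance.
losses = training_losses(
    toy_tts, toy_asr, toy_critic, toy_critic,
    text=[0, 2, 1], audio=[1.0, 0.0, 2.0],
)
print(losses)
```

In a real system the four terms would be combined and backpropagated through trainable networks, which requires a differentiable path through the intermediate text; the greedy decoding used here only serves to compute the loss values in the toy setting.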