Enhancing hybrid self-attention structure with relative-position-aware bias for speech synthesis
Abstract
A method of performing speech synthesis includes encoding character embeddings using any one or any combination of convolutional neural networks (CNNs) and recurrent neural networks (RNNs), applying a relative-position-aware self-attention function to each of the character embeddings and an input mel-scale spectrogram, and encoding the character embeddings to which the relative-position-aware self-attention function is applied. The method further includes concatenating the encoded character embeddings and the encoded character embeddings to which the relative-position-aware self-attention function is applied to generate an encoder output, applying a multi-head attention function to the encoder output and the input mel-scale spectrogram to which the relative-position-aware self-attention function is applied, and predicting an output mel-scale spectrogram based on the encoder output and the input mel-scale spectrogram to which the multi-head attention function is applied.
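The abstract does not define the relative-position-aware self-attention function. As a rough illustration only, the sketch below assumes one common formulation: a learned scalar bias, indexed by the clipped offset between two positions, is added to the scaled dot-product logits before the softmax. Every name here (`relative_position_bias`, `relative_self_attention`, `bias_table`, `max_distance`) is hypothetical, and the random weights stand in for trained parameters.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def relative_position_bias(length, max_distance, bias_table):
    # offsets[i, j] = j - i, clipped to [-max_distance, max_distance].
    positions = np.arange(length)
    offsets = np.clip(positions[None, :] - positions[:, None],
                      -max_distance, max_distance)
    # Shift offsets into [0, 2 * max_distance] and look up one bias per pair.
    return bias_table[offsets + max_distance]


def relative_self_attention(x, w_q, w_k, w_v, bias_table, max_distance):
    # x: (length, d_model); w_q, w_k, w_v: (d_model, d_head).
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    logits = q @ k.T / np.sqrt(q.shape[-1])
    # Assumed form of the "relative-position-aware bias": an additive term
    # that depends only on the clipped distance between positions.
    logits = logits + relative_position_bias(x.shape[0], max_distance, bias_table)
    weights = softmax(logits, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
length, d_model, d_head, max_distance = 6, 8, 4, 2
x = rng.normal(size=(length, d_model))  # stand-in for character embeddings
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
bias_table = rng.normal(size=2 * max_distance + 1)
output, weights = relative_self_attention(x, w_q, w_k, w_v, bias_table, max_distance)
```

Under this assumption the bias depends only on how far apart two positions are, so the same table can serve both the character embeddings and the mel-scale spectrogram frames. Whether the patented method shares or separates those parameters is not stated in the abstract.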