
Image-based approaches to classifying audio data



Abstract

Image-based machine learning approaches are used to classify audio data, such as speech data, as authentic or otherwise. For example, audio data can be obtained and a visual representation of it generated. The visual representation can be an image, such as a spectrogram, or another visual or electronic representation of the audio data. Before the image is processed, the audio data and/or the image may undergo various preprocessing techniques. The image representation of the audio data is then analyzed with a trained model to classify the audio data as authentic or otherwise.
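The abstract's pipeline (audio, then image, then preprocessing, then a trained model) can be illustrated with a minimal sketch. This is not the patented method: the function names, the frame length and hop size, the log-magnitude scaling, the min-max normalization, and the label strings are all assumptions chosen for illustration, and the "model" is any callable returning a score, standing in for a trained classifier that the abstract does not specify.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Turn a 1-D signal into a 2-D 'image': one row of log-magnitudes per frame."""
    # Hann window (assumed choice) reduces spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    image = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        row = []
        # Naive DFT over the non-negative frequency bins; slow but dependency-free.
        for k in range(frame_len // 2 + 1):
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            row.append(math.log1p(abs(coeff)))
        image.append(row)
    return image


def normalize(image):
    """Example preprocessing step (assumed): scale pixel values into [0, 1]."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant image
    return [[(v - lo) / span for v in row] for row in image]


def classify(image, model, threshold=0.5):
    """Apply a hypothetical trained model: a callable mapping an image to a score."""
    return "authentic" if model(image) >= threshold else "other"


if __name__ == "__main__":
    # Synthetic tone in place of real speech audio.
    tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
    image = normalize(spectrogram(tone))

    # Placeholder scorer, NOT a real authenticity detector: mean pixel intensity.
    def placeholder_model(img):
        return sum(map(sum, img)) / (len(img) * len(img[0]))

    print(len(image), "frames x", len(image[0]), "bins ->",
          classify(image, placeholder_model))
```

In practice the spectrogram would typically come from a signal-processing library and the scorer would be a trained image classifier; the sketch only shows how the stages connect.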


Bibliographic data

Publication number: US10504504B1

Patent type:
Publication date: 2019-12-10

Original format: PDF
Applicant/assignee: VOCALID INC.

Application number: US201816213388
Inventors: GEOFFREY S MELTZNER; RUPAL PATEL; MARKUS TOMAN

Filing date: 2018-12-07
Classification: G10L15/06; G10L15/22; G10L25/18; G10L25/24
Country: US
Date added to database: 2022-08-21 11:23:41
