
Cepstral Domain Modification of Audio Signals for Data Embedding -Preliminary Results


Abstract

A method of embedding data in an audio signal using cepstral domain modification is described. Building on successful embedding in the spectral points of perceptually masked regions in each frame of speech, the technique was first extended to embedding in the log spectral domain. This extension resulted in approximately 62 bits/s of embedding with a bit error rate (BER) of less than 2 percent for clean cover speech (from the TIMIT database), and about 2.5 percent for noisy speech (from an air traffic controller database), when all frames - including silence and transitions between voiced and unvoiced segments - were used. The bit error rate increased significantly when the log spectrum in the vicinity of a formant was modified. In the next procedure, embedding by altering the mean cepstral values of two ranges of indices was studied. Tests on both a noisy utterance and a clean utterance indicated a barely noticeable perceptual change in speech quality when the lower range of cepstral indices - corresponding to the vocal tract region - was modified in accordance with the data. With an embedding capacity of approximately 62 bits/s - using one bit per frame regardless of frame energy or type of speech - initial results showed a BER of less than 1.5 percent for a payload of 208 embedded bits using the clean cover speech. A BER of less than 1.3 percent resulted for the noisy host with a capacity of 316 bits. When the cepstrum was modified in the region of excitation, the BER increased to over 10 percent. With quantization causing no significant problem, the technique warrants further study with different cepstral ranges and sizes. Pitch-synchronous cepstrum modification, for example, may be more robust to attacks. In addition, cepstrum modification in regions of speech that are perceptually masked - analogous to embedding in frequency masked regions - may yield imperceptible stego audio with low BER.
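The abstract does not give the exact embedding rule, so the following Python sketch is only one plausible reading of "altering the mean cepstral values of two ranges of indices", not the author's implementation. It assumes non-overlapping 256-sample frames (at TIMIT's 16 kHz sampling rate this would give 62.5 frames/s, in line with the quoted rate of about 62 bits/s at one bit per frame), two hypothetical low-quefrency index ranges `BAND_A` and `BAND_B`, and a hypothetical margin `DELTA`. A bit is encoded in the sign of the difference between the two band means, and the original phase is reused for resynthesis.

```python
import numpy as np

FRAME = 256                  # assumed frame length in samples (not stated in the abstract)
BAND_A = np.arange(2, 10)    # hypothetical low-quefrency (vocal tract) index range
BAND_B = np.arange(10, 18)   # hypothetical neighbouring index range
DELTA = 0.05                 # assumed separation between the two band means
EPS = 1e-12                  # guards against log(0)


def real_cepstrum(frame):
    """Return the real cepstrum of a frame together with its spectrum."""
    spectrum = np.fft.rfft(frame)
    log_mag = np.log(np.abs(spectrum) + EPS)
    return np.fft.irfft(log_mag, n=len(frame)), spectrum


def embed_bit(frame, bit):
    """Force mean(BAND_A) - mean(BAND_B) to +DELTA for bit 1, -DELTA for bit 0."""
    n = len(frame)
    cep, spectrum = real_cepstrum(frame)
    diff = cep[BAND_A].mean() - cep[BAND_B].mean()
    target = DELTA if bit else -DELTA
    shift = (target - diff) / 2.0
    for band, sign in ((BAND_A, 1.0), (BAND_B, -1.0)):
        cep[band] += sign * shift
        cep[n - band] += sign * shift  # mirrored indices keep the cepstrum symmetric
    new_log_mag = np.fft.rfft(cep).real
    # Resynthesize with the modified magnitude and the original phase.
    new_spectrum = np.exp(new_log_mag) * np.exp(1j * np.angle(spectrum))
    return np.fft.irfft(new_spectrum, n=n)


def extract_bit(frame):
    """Recover the bit from the sign of the band-mean difference."""
    cep, _ = real_cepstrum(frame)
    return int(cep[BAND_A].mean() > cep[BAND_B].mean())


def embed(signal, bits):
    """Embed one bit per frame, in order, regardless of frame energy."""
    out = np.array(signal, dtype=float)
    for i, bit in enumerate(bits):
        start = i * FRAME
        out[start:start + FRAME] = embed_bit(out[start:start + FRAME], bit)
    return out


def extract(signal, n_bits):
    """Read back n_bits bits, one per frame."""
    frames = np.asarray(signal, dtype=float)
    return np.array([extract_bit(frames[i * FRAME:(i + 1) * FRAME])
                     for i in range(n_bits)])


def bit_error_rate(sent, received):
    """Fraction of bits that differ between the sent and received sequences."""
    return float(np.mean(np.asarray(sent) != np.asarray(received)))
```

A round trip could be checked with `bit_error_rate(bits, extract(embed(cover, bits), len(bits)))`, where `cover` holds at least `len(bits) * FRAME` samples. This sketch says nothing about perceptual quality or robustness to attacks; nonzero error rates such as those reported in the abstract would only appear once the stego signal is quantized, distorted, or processed between embedding and extraction.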

Bibliographic record

Source: 2004, pp. 151-161 (11 pages)
Conference location: San Jose, CA (US)
Author: K. Gopalan
Author affiliation: Department of Engineering, Purdue University Calumet, Hammond, IN 46323
Original format: PDF
Language: English
Chinese Library Classification: image signal processing; computer applications
Keywords: audio embedding; cepstrum modification; perceptual masking; watermarking

Similar literature

1. A preliminary study of combining mass spectrometric data with audio and video signals for real-time monitoring of controlled lab-scale fires [J]. M. Statheropoulos, K. Mikedi, P. Stavrakakis. Sensors and Actuators. 2011, No. 1
2. Patent Issued for Signal Processing of Audio and Video Data, Including Assessment of Embedded Data [J]. Journal of Engineering. 2013, No. 12
3. Data embedding in audio using time-scale modification [J]. Mansour M.F., Tewfik A.H. IEEE Transactions on Speech and Audio Processing. 2005, No. 3
4. Cepstral Domain Modification of Audio Signals for Data Embedding -Preliminary Results [C]. K. Gopalan. SPIE-The International Society for Optical Engineering, SPIE v.5306 Conference on security, steganography, and watermarking of multimedia contents. 2004
5. Time scale modification of digital audio signals and its applications [D]. Liu, Fang. 2004
6. Analysis of Smartphone Recordings in Time Frequency and Cepstral Domains to Classify Parkinson’s Disease [O]. Ilias Tougui, Abdelilah Jilbab, Jamal El Mhamdi. 2020
7. MLSP 2007 Data Analysis Competition: Frequency-Domain Blind Source Separation for Convolutive Mixtures of Speech/Audio Signals [O]. Hiroshi Sawada, Shoko Araki, Shoji Makino. 2008
