Reversible speaker de-identification using pre-trained transformation functions

Magariños Carmen; Lopez-Otero Paula; Docío-Fernández Laura; Rodriguez-Banga Eduardo; Erro Daniel; Garcia-Mateo Carmen



Abstract

Speaker de-identification approaches must accomplish three main goals: universality, naturalness and reversibility. The main drawback of the traditional approach to speaker de-identification using voice conversion techniques is its lack of universality, since a parallel corpus between the input and target speakers is necessary to train the conversion parameters. A synthetic target can be used to overcome this issue, but this harms the naturalness of the resulting de-identified speech. Hence, this paper proposes a technique in which a pool of pre-trained transformations between a set of speakers is used as follows: given a new user to de-identify, the most similar speaker in this set is chosen as the source speaker, and the speaker most dissimilar to the source speaker is chosen as the target speaker. Speaker similarity is measured using the i-vector paradigm, which is usually employed as an objective measure of speaker de-identification performance, leading to a system with high de-identification accuracy. The transformation method is based on frequency warping and amplitude scaling, in order to obtain natural-sounding speech while masking the identity of the speaker. In addition, compared to other voice conversion approaches, the proposed method is easily reversible. Experiments were conducted on the Albayzin database, and performance was evaluated in terms of objective and subjective measures. The results showed a high success rate when de-identifying speech, as well as highly natural transformed voices. In addition, when the transformation parameters are made available to a trusted holder, the de-identification procedure can be inverted, recovering the original speaker identity. The computational cost of the proposed approach is small, making it possible to produce de-identified speech in real time with a high level of naturalness.
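The selection rule described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract only says similarity is measured "using the i-vector paradigm", so cosine scoring is an assumption here, and the function names, the dictionary layout of the speaker pool and the toy 3-dimensional vectors are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed scoring)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def select_source_and_target(user_ivector, pool):
    """Pick (source, target) speaker ids from a pool of i-vectors.

    pool: dict mapping speaker id -> i-vector (hypothetical layout).
    Source = pool speaker most similar to the new user.
    Target = pool speaker least similar to that source.
    """
    source = max(pool, key=lambda s: cosine_similarity(user_ivector, pool[s]))
    candidates = [s for s in pool if s != source]
    target = min(candidates,
                 key=lambda s: cosine_similarity(pool[source], pool[s]))
    return source, target

# Toy 3-dimensional stand-ins; real i-vectors have far more dimensions.
pool = {
    "spk_a": [1.0, 0.0, 0.0],
    "spk_b": [0.9, 0.1, 0.0],
    "spk_c": [-1.0, 0.2, 0.0],
}
source, target = select_source_and_target([0.95, 0.0, 0.05], pool)
print(source, target)  # prints: spk_a spk_c
```

Once the pair is chosen, the pre-trained transformation from the source to the target speaker would be applied to the new user's speech, so no parallel corpus for the new user is needed.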

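The reversibility claim in the abstract can be illustrated with a toy example. The sketch below assumes a piecewise-linear frequency warping on a normalised frequency axis and an additive offset in the log-amplitude domain; the anchor values, the function names and the parameterisation are hypothetical and are not taken from the paper. It only shows why a strictly monotonic warping plus an additive scaling can be undone by anyone who holds the parameters.

```python
from bisect import bisect_right

def piecewise_linear(x, xs, ys):
    """Evaluate the piecewise-linear map through points (xs[i], ys[i]) at x."""
    # Locate the segment containing x, clamped to the first/last segment.
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

# Hypothetical warping anchors on a normalised frequency axis [0, 1].
# Both lists are strictly increasing, so the map is a bijection on [0, 1].
src_anchors = [0.0, 0.3, 0.7, 1.0]
dst_anchors = [0.0, 0.4, 0.6, 1.0]

def warp(f):
    """Forward frequency warping (de-identification direction)."""
    return piecewise_linear(f, src_anchors, dst_anchors)

def unwarp(f):
    """Inverse warping: swap the roles of the two anchor lists."""
    return piecewise_linear(f, dst_anchors, src_anchors)

def scale(log_amp, offset):
    """Amplitude scaling as an additive offset in the log-amplitude domain."""
    return log_amp + offset

def unscale(log_amp, offset):
    """Inverse amplitude scaling: subtract the same offset."""
    return log_amp - offset

# Round trip: a holder of the anchors and offset recovers the original values.
f, a, offset = 0.2, -1.5, 0.7
print(abs(unwarp(warp(f)) - f) < 1e-12)
print(abs(unscale(scale(a, offset), offset) - a) < 1e-12)
```

Without the anchors and the offset the inverse cannot be computed, which matches the abstract's point that only a trusted holder of the transformation parameters can recover the original speaker identity.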

Bibliographic details

Source
Computer Speech and Language | 2017, No. 11 | pp. 36-52 | 17 pages
Authors
Magariños Carmen; Lopez-Otero Paula; Docío-Fernández Laura; Rodriguez-Banga Eduardo; Erro Daniel; Garcia-Mateo Carmen
Author affiliations

AtlantTIC Research Center, E.E. Telecomunicación, Campus Universitario S/N, Vigo, Spain;

AtlantTIC Research Center, E.E. Telecomunicación, Campus Universitario S/N, Vigo, Spain;

AtlantTIC Research Center, E.E. Telecomunicación, Campus Universitario S/N, Vigo, Spain;

AtlantTIC Research Center, E.E. Telecomunicación, Campus Universitario S/N, Vigo, Spain;

Ikerbasque, Aholab, University of the Basque Country, Ingeniaritza Goi Eskola Teknikoa, Urkijo Zum. z/g, Bilbao, Spain;

AtlantTIC Research Center, E.E. Telecomunicación, Campus Universitario S/N, Vigo, Spain;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Amplitude scaling; Frequency warping; i-vector; Speaker de-identification; Speaker re-identification; Voice transformation;


Similar documents


1. Extensively Reversible Thermal Transformations of a Bistable, Fluorescence-Switchable Molecular Solid: Entry into Functional Molecular Phase-Change Materials [J]. Srujana P., Radhakrishnan T. P. Angewandte Chemie. 2015, No. 25
2. Reversible Solvent-Exchange-Driven Transformations in Multifunctional Coordination Polymers Based on Copper-Containing Organosulfur Ligands [J]. Almudena Gallego, Oscar Castillo, Carlos J. Gómez-García, European journal of inorganic chemistry. 2014, No. 24
3. Organic functional group transformations in water at elevated temperature and pressure: Reversibility, reactivity, and mechanisms [J]. Shipp J., Gould I.R., Herckes P., Geochimica et Cosmochimica Acta: Journal of the Geochemical Society and the Meteoritical Society. 2013
4. Piecewise linear definition of transformation functions for speaker de-identification [C]. Carmen Magariños, Paula Lopez-Otero, Laura Docio-Fernandez, 2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines. 2016
5. A new secure image transmission technique via secret-fragment-visible mosaic images by nearly reversible color transformations [D]. Vadde, Sandeep Kumar. 2015
6. Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model [O]. Rehan Ahmad, Syed Zubair, Hani Alquhayz, 2019
7. Speaker De-identification via Voice Transformation [O]. Qin Jin, Arthur R. Toth, Tanja Schultz, 2010
