Estimation of speaker position using audio information


Abstract

Real-time conversational video telecommunications services, such as video-conferencing, are becoming ever more important as a substitute for face-to-face meetings. One of the perceived weaknesses of existing services is the picture quality achieved, especially around the face of a speaker. A possible solution is to identify the location of the face, which is then transmitted at a higher quality than the rest of the picture. In this paper, we present a new technique for identifying the face using an array of microphones. Unlike the techniques proposed so far, which make assumptions about the content of the video material, the idea relies on estimating the lip position by processing the audio of the speaker's speech. Once this estimation is performed, a two- or possibly three-stage quantisation of the video information allows the subjectively more important parts, i.e. the face of the speaker, to be compressed with lower distortion. This new technique, which is compatible with all existing video compression standards, is much cheaper and easier to implement than previous techniques.
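The abstract does not say how the audio processing or the quantisation is carried out, so the following is only a minimal sketch of one common way such a pipeline could look, not the authors' method. It assumes a single pair of microphones, a far-field source, and a speed of sound of 343 m/s; the function names (`estimate_delay`, `bearing_from_delay`, `quantiser_steps`) and all parameters are hypothetical. The delay between the two channels is taken from the peak of their cross-correlation, converted to a bearing, and a two-level quantiser step map then gives the block columns around the estimated face position a finer step.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag in samples by which sig_b trails sig_a.

    The lag is the position of the cross-correlation peak, searched
    over the range [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_from_delay(lag_samples, sample_rate, mic_spacing):
    """Convert a delay to a far-field bearing in degrees from broadside.

    A positive angle means the source is nearer to microphone A.
    """
    tau = lag_samples / sample_rate
    s = SPEED_OF_SOUND * tau / mic_spacing
    s = max(-1.0, min(1.0, s))  # clamp against rounding and noise
    return math.degrees(math.asin(s))


def quantiser_steps(num_cols, face_col, half_width, fine, coarse):
    """Two-level step map: fine steps near the face, coarse elsewhere."""
    return [fine if abs(c - face_col) <= half_width else coarse
            for c in range(num_cols)]


# Synthetic check: microphone B hears the same pulse 3 samples later.
pulse = [0.0] * 20 + [1.0, 2.0, 3.0, 2.0, 1.0] + [0.0] * 20
delayed = [0.0] * 3 + pulse[:-3]
lag = estimate_delay(pulse, delayed, max_lag=8)
angle = bearing_from_delay(lag, sample_rate=16000, mic_spacing=0.2)
steps = quantiser_steps(num_cols=8, face_col=3, half_width=1, fine=4, coarse=16)
```

Mapping the bearing to a picture column depends on the camera geometry, which the abstract does not describe, so `face_col` is supplied by hand here. A third quantisation stage could be sketched the same way by adding an intermediate step size for a band around the fine region.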


Bibliographic details

Source: IEEE Region 10 Annual Conference, 1997, 4 pages
Authors: Vahedian A.; Frater M.; Institute of Electric and Electronic Engineer
Original format: PDF
Chinese Library Classification: Automation technology, computer technology

Similar literature


1. Estimation of speaker and listener positions in a car using binaural signals [J]. Madoka Takimoto, Takanori Nishino, Hiroyuki Hoshino. Acoustical Science and Technology. 2008, No. 1
2. Estimation of speaker and listener positions in a car using binaural signals [J]. Hiroyuki Hoshino, Kazuya Takeda, Madoka Takimoto. Acoustical Science and Technology. 2008, No. 1
3. Auditory and visual information integration using Bayesian networks for speaker's position estimation [J]. Yoichi Motomura, Futoshi Asano, Hideki Asoh. 電子情報通信学会技術研究報告. ヒュ-マン情報処理. Human Information Processing. 2002, No. 595
4. Estimation of speaker position using audio information [C]. Vahedian, A., Frater, . 1997
5. Robust Speaker Modeling in Non-Neutral Environments with Application to Large Scale Multi-Speaker Audio Streams [D]. Yu, Chengzhu. 2017
6. Speech Audiometry at Home: Automated Listening Tests via Smart Speakers With Normal-Hearing and Hearing-Impaired Listeners [O]. Jasper Ooster, Melanie Krueger, Jörg-Hendrik Bach. 2020
7. Three-dimensional Speaker Localization: Audio-refined Visual Scaling Factor Estimation [O]. Xinyuan Qian, Qi Liu, Jiadong Wang. 2021
