Speaker identification based text to audio alignment for an audio retrieval system

Abstract

We report on an audio retrieval system which lets Internet users efficiently access a large audio database containing recordings of the proceedings of the United States House of Representatives. The audio has been temporally aligned to text transcripts of the proceedings (which are manually generated by the US Government) using a novel method based on speaker identification. Speaker sequence and approximate timing information is extracted from the text transcript and used to constrain a Viterbi alignment of speaker models to the observed audio. Speakers are modeled by computing Gaussian statistics of cepstral coefficients extracted from samples of each person's speech. The speaker identification is used to locate speaker transition points in the audio which are then linked to corresponding speaker transitions in the text transcript. The alignment system has been successfully integrated into a World Wide Web based search and browse system as an experimental service on the Internet.
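
The alignment idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one diagonal Gaussian per speaker (the abstract says only "Gaussian statistics"), uses plain feature vectors in place of real cepstral coefficients, and constrains the Viterbi search with the speaker sequence alone, leaving out the approximate timing information. The names `fit_gaussian`, `log_likelihood`, and `align` are hypothetical.

```python
import math


def fit_gaussian(frames):
    """Per-dimension mean and variance (a diagonal Gaussian) of training frames."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so a constant feature cannot cause division by zero.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var


def log_likelihood(frame, model):
    """Log density of one frame under a diagonal Gaussian speaker model."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(frame, mean, var))


def align(frames, speaker_sequence, models):
    """Return the frame index at which each speaker turn starts.

    The search is left to right: at every frame the path either stays in
    the current turn or advances to the next turn in the transcript order.
    """
    n_frames, n_turns = len(frames), len(speaker_sequence)
    if n_frames < n_turns:
        raise ValueError("need at least one frame per speaker turn")
    neg_inf = float("-inf")
    # score[t][s]: best log-likelihood of frames[0..t] with frame t in turn s.
    score = [[neg_inf] * n_turns for _ in range(n_frames)]
    advanced = [[False] * n_turns for _ in range(n_frames)]
    score[0][0] = log_likelihood(frames[0], models[speaker_sequence[0]])
    for t in range(1, n_frames):
        for s in range(n_turns):
            stay = score[t - 1][s]
            move = score[t - 1][s - 1] if s > 0 else neg_inf
            best = max(stay, move)
            if best == neg_inf:
                continue  # turn s is not reachable by frame t
            advanced[t][s] = move > stay
            score[t][s] = best + log_likelihood(frames[t],
                                                models[speaker_sequence[s]])
    # Backtrace from the last turn at the last frame to find transitions.
    starts = [0] * n_turns
    s = n_turns - 1
    for t in range(n_frames - 1, 0, -1):
        if advanced[t][s]:
            starts[s] = t
            s -= 1
    return starts
```

Given per-speaker models trained with `fit_gaussian` and the speaker order read from the transcript, `align` returns one start frame per turn, and those frames are the speaker transition points that would be linked back to the corresponding transitions in the text. A fuller version would presumably also restrict each transition to a window around the transcript's approximate time.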

Bibliographic details

Source: 1997, pp. 1099-1102 (4 pages)
Authors: Roy, D.; Malamud, C.
Original format: PDF
Chinese Library Classification: Radio electronics, telecommunication technology

Similar literature

1. Audio Keywords Discovery for Text-Like Audio Content Analysis and Retrieval [J]. Lu L., Hanjalic A. IEEE Transactions on Multimedia. 2008, No. 1
2. A score identification parallel system based on audio-to-score alignment [J]. Munoz-Montoro A. J., Cortina R., Garcia-Galan S. Journal of Supercomputing. 2020, No. 11
3. Arabic Audio News Retrieval System Using Dependent Speaker Mode, Mel Frequency Cepstral Coefficient and Dynamic Time Warping Techniques [J]. Hasan Muaidi, Ayat Al-Ahmad, Thaer Khdoor. Research Journal of Applied Science, Engineering and Technology. 2014, No. 24
4. Speaker identification based text to audio alignment for an audio retrieval system [C]. Roy D., Malamud C. Institute of Electrical and Electronics Engineers (IEEE) International Conference on Acoustics, Speech, and Signal Processing. 1997
5. Automatic segmentation, indexing and retrieval of audiovisual data based on combined audio and visual content analysis [D]. Zhang, Tong. 1999
6. A comparison of text versus audio for information comprehension with future uses for smart speakers [O]. Gondy Leroy, David Kauchak. 2019
7. Speaker Identification Based Text To Audio Alignment For An Audio Retrieval System [O]. Deb Roy, Carl Malamud. 1997