
Development of Multi-lingual Spoken Corpora of Indian Languages


Abstract

This paper describes a recently initiated effort to collect and transcribe read as well as spontaneous speech data in four Indian languages. The completed preparatory work includes the design of phonetically rich sentences, a data acquisition setup for recording speech over the telephone channel, and a Wizard of Oz setup for acquiring speech data from a spoken dialogue between a caller and the machine in the context of a remote information retrieval task. The paper gives an account of the care taken to collect speech data that is as close to real-world conditions as possible, and reports the current status of the programme and the actions planned to achieve its goal.

Bibliographic details

Source
Chinese Spoken Language Processing; Lecture Notes in Artificial Intelligence; 4274 | 2006 | pp. 792-801 | 10 pages
Conference location: Singapore (SG)
Author
K. Samudravijaya
Affiliation

Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400005, India

Original format: PDF
Language of text: English
CLC classification: Programming languages, algorithmic languages

Similar literature

1. Development of speech corpora for speaker recognition research and evaluation in Indian languages [J]. Hemant A. Patil, T.K. Basu. International Journal of Speech Technology, 2008, No. 1.
2. Emanuela Cresti and Massimo Moneglia (Eds.), C-ORAL-ROM. Integrated Reference Corpora for Spoken Romance Languages. John Benjamins Publishing Co., Amsterdam, The Netherlands, 2005, xvii, 299 pp + index (incl. DVD), ISBN 90-272-2286-x [J]. Thomas Roller. Machine Translation, 2006, No. 4.
3. Corpus-Based Translation Induction in Indian Languages Using Auxiliary Language Corpora from Wikipedia [J]. Tholpadi Goutham, Bhattacharyya Chiranjib, Shevade Shirish. ACM Transactions on Asian Language Information Processing, 2017, No. 3.
4. Development of Multi-lingual Spoken Corpora of Indian Languages [C]. K. Samudravijaya. International Symposium on Chinese Spoken Language Processing, 2006.
5. Interaction, authenticity and spoken corpora: Building teaching materials for adult English language learners [D]. Cunningham, Courtney. 2010.
6. Self-ratings of Spoken Language Dominance: A Multi-Lingual Naming Test (MINT) and Preliminary Norms for Young and Aging Spanish-English Bilinguals [O]. Tamar H. Gollan, Gali H. Weissberger, Elin Runnqvist.
7. EXPLORING THE POSSIBILITY OF APPLYING NATURAL LANGUAGE PROCESSING (NLP) IN INTER MULTI-LINGUAL TRANSLATION OF INDIAN LANGUAGES FOR ENHANCED EASE OF INTEROPERABILITY [O]. Vipul Goyal, Hardik Chaudhary. 2020.
