Extending the MPC corpus to Chinese and Urdu - A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language

Abstract

In this paper, we report our efforts to build a multi-lingual multi-party online chat corpus (MMPC), both to develop a firm understanding of a set of social constructs, such as agenda control, influence, and leadership, and to computationally model these constructs in online interactions. These automated models will help capture the dialogue dynamics that are essential for developing, among other things, realistic human-machine dialogue systems, including autonomous virtual chat agents. We first introduce our experiment design and data collection method for Chinese and Urdu, and then report on the current stage of our data collection. We annotated the collected corpus on four levels: communication links, dialogue acts, local topics, and meso-topics. Analyses of the annotated data across languages reveal some interesting phenomena, which are reported in the paper.

Bibliographic record

Source
International Conference on Language Resources and Evaluation | 2012 | pp. 2868-2873 | 6 pages
Authors
Ting Liu; Samira Shaikh; Tomek Strzalkowski; Aaron Broadwell; Jennifer Stromer-Galley; Sarah Taylor; Umit Boz; Xiaoai Ren; Jingsi Wu;
Original format: PDF
Keywords
Multi-lingual Multi-party online-chat; annotation; social phenomena; post-session survey;

Similar literature

1. Multimodal corpus of multiparty conversations in L1 and L2 languages and findings obtained from it [J] . Yamamoto Seiichi, Taguchi Keiko, Ijuin Koki, Language Resources and Evaluation . 2015, no. 4

2. Towards building a Urdu Language Corpus using Common Crawl [J] . Shafiq Hafiz Muhammad, Tahir Bilal, Mehmood Muhammad Amir, Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology . 2020, no. 2Pta2

3. CLEU - A Cross-Language English-Urdu Corpus and Benchmark for Text Reuse Experiments [J] . Muneer Iqra, Sharjeel Muhammad, Iqbal Muntaha, Journal of the American Society for Information Science and Technology . 2019, no. 7

4. Extending the MPC corpus to Chinese and Urdu - A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language [C] . Ting Liu, Samira Shaikh, Tomek Strzalkowski, International conference on language resources and evaluation . 2012

5. A historical and sociolinguistic approach to language change in Mandarin Chinese: Corpus evidence for the development of YOU-MEI-YOU. [D] . Li, Wenfeng. 2016

6. A corpus for mining drug-related knowledge from Twitter chatter: Language models and their utilities [O] . Abeed Sarker, Graciela Gonzalez . 2017

7. Corpus applications in the teaching of Chinese Language (2): The construction and the application of Chinese corpus that supports the changeable vocabulary education [O] . 砂岡和子 . 2005
