Annotated Amharic Corpora


Abstract

Amharic is one of the under-resourced languages. This paper presents two text corpora. The first is a substantially cleaned version of the existing morphologically annotated WIC Corpus (210,000 words). The second is the largest Amharic text corpus (17 million words), created from Web pages crawled automatically in 2013, 2015 and 2016. It is part-of-speech annotated by a tagger trained and evaluated on the WIC Corpus.


Bibliographic details

Source: International Conference on Text, Speech and Dialogue | 2016 | 550p | 8 pages
Authors: Pavel Rychly; Vit Suchomel
Original format: PDF
Chinese Library Classification: TP391.1-53

