Multilingual Tokenization and Part-of-speech Tagging. Lightweight Versus Heavyweight Algorithms

Abstract

This work focuses on morphological analysis of raw text and provides a recipe for tokenization, sentence splitting and part-of-speech tagging for all languages included in the Universal Dependencies Corpus. Scalability is an important issue when dealing with large multilingual corpora. The experiments include both lightweight classifiers (linear models and decision trees) and heavyweight LSTM-based architectures that are able to attain state-of-the-art results. All experiments are carried out using the provided data "as-is". We apply lightweight and heavyweight classifiers to 5 distinct tasks across multiple languages; we present some lessons learned during the training process; we look at per-language results as well as task averages; we report model footprints; and finally we draw a few conclusions regarding the trade-offs between the classifiers' characteristics.
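The abstract contrasts lightweight classifiers (linear models and decision trees) with heavyweight LSTM-based architectures for per-token tasks such as part-of-speech tagging over Universal Dependencies treebanks. As a rough illustration of the lightweight end of that spectrum, the sketch below trains a decision-tree UPOS tagger on hand-crafted surface features read from CoNLL-U files. This is a minimal sketch under stated assumptions, not the paper's recipe: the file paths, the feature set, and the use of scikit-learn are illustrative choices, not details taken from the paper.

```python
# Minimal sketch: a "lightweight" per-token UPOS tagger (decision tree over
# simple surface features) trained on CoNLL-U data. Paths and features are
# illustrative assumptions, not the paper's actual setup.
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier


def read_conllu(path):
    """Yield (tokens, upos_tags) pairs from a CoNLL-U file."""
    tokens, tags = [], []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.rstrip("\n")
            if not line:
                if tokens:
                    yield tokens, tags
                    tokens, tags = [], []
                continue
            if line.startswith("#"):
                continue
            cols = line.split("\t")
            if "-" in cols[0] or "." in cols[0]:  # skip multiword/empty tokens
                continue
            tokens.append(cols[1])  # FORM column
            tags.append(cols[3])    # UPOS column
    if tokens:
        yield tokens, tags


def token_features(tokens, i):
    """Simple surface features for the token at position i."""
    w = tokens[i]
    return {
        "word": w.lower(),
        "suffix3": w[-3:].lower(),
        "prefix2": w[:2].lower(),
        "is_title": w.istitle(),
        "is_digit": w.isdigit(),
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>",
    }


def build_dataset(path):
    X, y = [], []
    for tokens, tags in read_conllu(path):
        for i, tag in enumerate(tags):
            X.append(token_features(tokens, i))
            y.append(tag)
    return X, y


if __name__ == "__main__":
    # Hypothetical UD treebank paths; substitute any UD .conllu files.
    X_train, y_train = build_dataset("ud-train.conllu")
    X_dev, y_dev = build_dataset("ud-dev.conllu")

    tagger = Pipeline([
        ("vec", DictVectorizer(sparse=True)),
        ("clf", DecisionTreeClassifier(max_depth=40)),
    ])
    tagger.fit(X_train, y_train)
    print("dev accuracy:", tagger.score(X_dev, y_dev))
```

A comparable heavyweight baseline would replace the feature dictionary and decision tree with a word- or character-level LSTM sequence model; the difference in model footprint and training cost between the two is the kind of trade-off the paper examines.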
