XML Rules for Enclitic Segmentation

Abstract

Segmenting sentences into words is a complex and important task in almost all natural language processing applications. Several works conceal or sidestep the difficulties involved in this process. In some cases, they adopt an easy partial solution that is acceptable for certain languages and applications; in others, they rely on a later or earlier phase to solve it. However, hardly any papers explain how these later or earlier phases have to be carried out.

In this paper we describe these problems, focusing on part-of-speech tagging tasks, and propose a solution for one of them: the segmentation of verbal forms that contain enclitic pronouns. We present a generic verb processing system that segments and pretags verbs with enclitic pronouns attached to them.

The system does not limit its function to segmentation: it also pretags the different linguistic components of a verbal form with enclitics and removes tags that are invalid in its context. This innovative feature is useful for part-of-speech taggers, which can use the information to avoid certain errors and thus improve their results.

Although we have applied it to the Galician language, it can be easily adapted to other Romance languages. The generic rule system we have designed allows rules to be written in XML files. This, combined with the use of lexicons, makes the adaptation simple and independent of the system internals.
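
The abstract does not show the rule format, so the following is only a minimal sketch of how XML-defined enclitic rules and a lexicon might drive segmentation and pretagging. The `<rules>` and `<enclitic>` elements, their `form` and `tag` attributes, the tag names, the toy lexicon, and the accent-stripping step are all assumptions made for illustration; none of them are taken from the paper.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical rule file: element and attribute names are invented for this sketch.
RULES_XML = """
<rules>
  <enclitic form="lle" tag="PRON"/>
  <enclitic form="me" tag="PRON"/>
  <enclitic form="te" tag="PRON"/>
</rules>
"""

# Toy lexicon (illustrative only): surface form -> candidate tags.
LEXICON = {
    "chamou": {"V"},
    "deu": {"V"},
    "para": {"V", "PREP"},
}


def load_rules(xml_text):
    """Parse the rule file into (form, tag) pairs, longest enclitic first."""
    root = ET.fromstring(xml_text)
    rules = [(e.get("form"), e.get("tag")) for e in root.iter("enclitic")]
    return sorted(rules, key=lambda r: len(r[0]), reverse=True)


def strip_acute(text):
    """Remove acute accents, a crude stand-in for real accent rules."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace("\u0301", ""))


def segment(token, rules, lexicon):
    """Split a verb+enclitic token and pretag both parts.

    Returns a list of (form, tags). When a split is accepted, the host
    keeps only the verb tag, since only a verb can carry the enclitic.
    """
    word = token.lower()
    for form, tag in rules:
        if not word.endswith(form) or len(word) <= len(form):
            continue
        host = word[: -len(form)]
        # Try the host as written, then without the acute accent.
        for candidate in (host, strip_acute(host)):
            if "V" in lexicon.get(candidate, set()):
                return [(candidate, {"V"}), (form, {tag})]
    # No rule applies: return the token with its lexicon tags untouched.
    return [(token, set(lexicon.get(word, set())))]
```

Under these assumptions, `segment("párame", load_rules(RULES_XML), LEXICON)` splits the token into `para` and `me` and discards the `PREP` reading of `para`, while the standalone token `para` keeps both candidate tags. Adapting the sketch to another language would mean swapping the XML rules and the lexicon, leaving the Python code unchanged.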

Bibliographic details

Source
International Conference on Computer Aided Systems Theory (EUROCAST 2007); 12-16 February 2007; Las Palmas de Gran Canaria (ES) | 2007 | pp. 273-281 | 9 pages
Conference location: Las Palmas de Gran Canaria (ES)
Authors
Fco. Mario Barcala; Miguel A. Molinero; Eva Dominguez
Author affiliation

Centro Ramon Pineiro, Ctra. Santiago-Noia km. 3, A Barcia, 15896 Santiago de Compostela, Spain

Original format: PDF
Language of text: English
Chinese Library Classification: Machine-aided technology
