Conference: International Workshop on Evaluation of Natural Language and Speech Tools for Italian

The Tanl Lemmatizer Enriched with a Sequence of Cascading Filters

Abstract

We have extended an existing lemmatizer, which relies on a lexicon of about 1.2 million forms in which lemmas are indexed by rich PoS tags, with a sequence of cascading filters, each in charge of dealing with specific issues related to out-of-dictionary words. The last two filters are devoted to resolving semantic ambiguities between words of the same syntactic category by querying external resources: an enriched index built on the Italian Wikipedia and the Google index.
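
The cascade described in the abstract can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not the Tanl implementation: the toy `LEXICON`, the two filters (`lowercase_filter`, `clitic_filter`), and the fallback of returning the surface form are all hypothetical, since the abstract does not say which filters are used or in what order. The sketch shows only the general pattern: look up the (form, PoS tag) pair in the lexicon, and for out-of-dictionary words try each filter in turn until one yields a lemma.

```python
from typing import Callable, List, Optional

# Hypothetical toy lexicon: (form, PoS tag) -> lemma.
# The real lexicon is said to hold about 1.2 million forms.
LEXICON = {
    ("cani", "NOUN"): "cane",
    ("mangia", "VERB"): "mangiare",
}

# A filter takes a form and its PoS tag and returns a lemma or None.
Filter = Callable[[str, str], Optional[str]]


def lowercase_filter(form: str, pos: str) -> Optional[str]:
    """Hypothetical filter: retry the lookup with the lowercased form."""
    return LEXICON.get((form.lower(), pos))


def clitic_filter(form: str, pos: str) -> Optional[str]:
    """Hypothetical filter: strip a trailing clitic from a verb and retry."""
    if pos != "VERB":
        return None
    for clitic in ("lo", "la", "li", "le"):
        if form.endswith(clitic):
            lemma = LEXICON.get((form[: -len(clitic)], pos))
            if lemma is not None:
                return lemma
    return None


# Order matters: earlier filters get the first chance to answer.
FILTERS: List[Filter] = [lowercase_filter, clitic_filter]


def lemmatize(form: str, pos: str, filters: List[Filter] = FILTERS) -> str:
    """Look up the lexicon first, then fall through the filter cascade."""
    lemma = LEXICON.get((form, pos))
    if lemma is not None:
        return lemma
    for apply_filter in filters:
        lemma = apply_filter(form, pos)
        if lemma is not None:
            return lemma
    # Assumed fallback: return the surface form when nothing matches.
    return form


if __name__ == "__main__":
    for form, pos in [("cani", "NOUN"), ("Cani", "NOUN"), ("mangialo", "VERB")]:
        print(form, pos, "->", lemmatize(form, pos))
```

In a design like this, filters that query external resources (such as the Wikipedia-based index and the Google index mentioned above) would naturally sit at the end of the list, so that cheaper local checks run first.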