Conference: International Conference on Portugal for Natural Language Processing

Constructing Empirical Formulas for Testing Word Similarity by the Inductive Method of Model Self-Organization


Abstract

Identifying words with the same base meaning is a necessary step in many algorithms of computational linguistics and text processing. For this task we propose a knowledge-poor approach: an empirical formula based on the number of coincident letters in the initial parts of the two words and the number of non-coincident letters in their final parts. To construct such a formula for a given language, we use the inductive method of model self-organization developed by A. Ivahnenko. This method considers a set of models (formulas) of a given class and selects the best ones using training and test samples. We give a detailed example for English and show how to apply the formula to create a word frequency list.
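
The two quantities named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the fitted formula, so the linear decision rule below and its coefficients `a` and `b` are hypothetical placeholders, not the values obtained in the paper; the function names are likewise assumptions made for illustration.

```python
from typing import Tuple


def prefix_suffix_counts(w1: str, w2: str) -> Tuple[int, int]:
    """Return (coincident initial letters, non-coincident final letters)."""
    common = 0
    for c1, c2 in zip(w1, w2):
        if c1 != c2:
            break
        common += 1
    # Letters remaining in both words after the shared initial part.
    rest = (len(w1) - common) + (len(w2) - common)
    return common, rest


def looks_similar(w1: str, w2: str, a: float = 0.5, b: float = 0.0) -> bool:
    """Hypothetical linear rule; a and b are placeholders, not fitted values."""
    common, rest = prefix_suffix_counts(w1.lower(), w2.lower())
    total = len(w1) + len(w2)
    if total == 0:
        return False
    # Accept the pair when the share of non-coincident final letters is small.
    return rest / total <= a + b * common
```

With these placeholder coefficients, `looks_similar("reading", "reader")` accepts the pair and `looks_similar("reading", "table")` rejects it. In the approach the abstract describes, the form of the formula and its coefficients would instead be selected per language from training and test samples.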
