
On using parametric string distances and vector quantization in designing syntactic pattern recognition systems


Abstract

This paper considers a fundamental problem in syntactic pattern recognition: recognizing a string from a noisy version of it. We assume that the system has a dictionary containing the ideal representations of all the objects in question. When a noisy sample has to be processed, the system compares it with every element of the dictionary and selects the nearest neighbor. The comparison is typically based on three standard edit operations: substitution, insertion and deletion. One assigns a distance d(.,.) to each elementary symbol operation, and the inter-pattern distance D(.,.) is computed as a function of these symbol edit distances. Here the inter-symbol distances are assigned using the parametric distances introduced by Bunke et al. (1993). We show how the classifier can be trained to obtain the optimal parametric distance using vector quantization in the meta-space, and we report classification results after such training. In all our experiments, training typically converged in very few iterations. The classification accuracy obtained with this single-parameter scheme was 96.13%. For comparison, the scheme that uses the complete array of inter-symbol distances, derived from knowledge of all the confusion probabilities, attains 96.67%, so the single-parameter scheme comes within 0.54 percentage points of it.
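The nearest-neighbor comparison described in the abstract can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the authors' implementation: the abstract does not define the parametric distance, so the sketch assumes a single parameter `r` that sets the cost of an insertion or deletion, with a fixed substitution cost `sub_cost`. The function names `parametric_edit_distance` and `classify` are hypothetical, and the vector-quantization training of the parameter is not shown.

```python
from typing import Sequence


def parametric_edit_distance(x: str, y: str, r: float = 1.0,
                             sub_cost: float = 1.0) -> float:
    """Inter-pattern distance D(x, y) computed by dynamic programming.

    Assumption (not taken from the abstract): inserting or deleting any
    symbol costs r, and substituting one symbol for a different one
    costs sub_cost. Matching symbols cost nothing.
    """
    # prev[j] holds the distance between the first i-1 symbols of x
    # and the first j symbols of y.
    prev = [j * r for j in range(len(y) + 1)]
    for i, a in enumerate(x, start=1):
        curr = [i * r]  # deleting the first i symbols of x
        for j, b in enumerate(y, start=1):
            d_sub = 0.0 if a == b else sub_cost
            curr.append(min(prev[j - 1] + d_sub,  # substitute or match
                            prev[j] + r,          # delete a from x
                            curr[j - 1] + r))     # insert b into x
        prev = curr
    return prev[-1]


def classify(noisy: str, dictionary: Sequence[str], r: float = 1.0) -> str:
    """Return the dictionary entry nearest to the noisy string."""
    return min(dictionary,
               key=lambda word: parametric_edit_distance(noisy, word, r))
```

With `r = 1.0` and `sub_cost = 1.0` this reduces to the ordinary Levenshtein distance. Lowering `r` below half the substitution cost makes a deletion followed by an insertion cheaper than a substitution, which is one way a single parameter can change which dictionary entry is nearest.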
