USING MULTISET DISCRIMINATION TO SOLVE LANGUAGE PROCESSING PROBLEMS WITHOUT HASHING

Cai JZ.; Paige R.

首页> 外文期刊>Theoretical computer science >USING MULTISET DISCRIMINATION TO SOLVE LANGUAGE PROCESSING PROBLEMS WITHOUT HASHING

【24h】

USING MULTISET DISCRIMINATION TO SOLVE LANGUAGE PROCESSING PROBLEMS WITHOUT HASHING

机译：使用多集区分解决不散列的语言处理问题

获取原文

获取原文并翻译 | 示例

掌桥外文数据库（机构版） >>

开具论文收录证明 >>

文献代查 >>

页面导航

摘要
著录项
相似文献
相关主题

摘要

It is generally assumed that hashing is essential to solve many language processing problems efficiently; e.g. symbol table formation and maintenance, grammar manipulation, basic block optimization, and global optimization. This paper questions this assumption, and initiates development of an efficient alternative compiler methodology without hashing or sorting. The methodology rests on efficient solutions to the basic problem of detecting duplicate values in a multiset, which we call multiset discrimination. Paige and Tarjan (1987) gave an efficient solution to multiset discrimination for detecting duplicate elements occurring in a multiset of varying length strings. The technique was used to develop an improved algorithm for lexicographic sorting, whose importance stems largely from its use in solving a variety of isomorphism problems (Aho et al., 1974), The current paper and a related paper (Paige, 1994) show that full lexicographic sorting is not needed to solve these isomorphism problems, because they can be solved more efficiently using straightforward extensions to the simpler multiset discrimination technique. By reformulating language processing problems in terms of multiset discrimination, we also show how almost every subtask of compilation can be solved without hashing in worst case running time no worse (and frequently better) than the best previous expected time solution (under the assumption that one hash operation takes unit expected time), Because of their simplicity, our solutions may be of practical as well as theoretical interest. The various applications presented culminate with a new algorithm to solve iterated strength reduction folded with useless code elimination that runs in worst case asymptotic time and auxiliary space Theta(/L/ + /L'/), where /L/ and /L'/ represent the lengths of the initial and optimized programs, respectively. The previous best solution due to Cocke and Kennedy (1977) takes Omega(/L/(3)/L'/) has operations in the worst case. [References: 41]

机译：通常认为，散列对于有效解决许多语言处理问题至关重要。例如符号表的形成和维护，语法操作，基本块优化和全局优化。本文质疑这一假设，并着手开发一种无需哈希或排序的高效替代编译器方法。该方法基于有效的解决方案，以解决检测多集中重复值的基本问题，我们称之为多集区分。 Paige和Tarjan（1987）为多集判别提供了一种有效的解决方案，用于检测出现在可变长度字符串的多集中的重复元素。该技术被用于开发一种改进的词典编目排序算法，其重要性很大程度上源于其在解决各种同构问题中的用途（Aho等人，1974年），当前论文和相关论文（Paige，1994年）表明：不需要完整的字典分类法来解决这些同构问题，因为可以使用简单的多集判别技术的直接扩展来更有效地解决它们。通过以多集区分的方式重新表达语言处理问题，我们还展示了几乎所有的编译子任务都可以在不进行散列的情况下解决，而在最坏的情况下运行时间不会比以前的最佳预期时间解决方案差（并且常常更好）（假设哈希运算需要花费单位预期时间），由于其简单性，我们的解决方案可能具有实际意义和理论意义。提出的各种应用最终以一种新算法解决了迭代强度降低的问题，该算法通过无用代码消除折叠起来，在最坏情况下的渐近时间和辅助空间Theta（/ L / + / L'/），其中/ L /和/ L'/分别代表初始程序和优化程序的长度。由于Cocke和Kennedy（1977）提出的先前最佳解决方案采用了Omega（/ L /（3）/ L'/）在最坏的情况下运行。 [参考：41]

著录项

来源
《Theoretical computer science》 |1995年第2期|共40页
作者
Cai JZ.; Paige R.;
展开▼
作者单位

展开▼
收录信息
原文格式 PDF
正文语种 eng
中图分类一般性问题;数学;
关键词
Linear time; Algorithms;

机译：线性时间;算法;

相似文献

外文文献
中文文献
专利

1. USING MULTISET DISCRIMINATION TO SOLVE LANGUAGE PROCESSING PROBLEMS WITHOUT HASHING [J] . Cai JZ., Paige R. Theoretical computer science . 1995,第1a2期

机译：使用多集区分解决不散列的语言处理问题
2. Checks and Balances Constraint Solving without Surprises in Object-Constraint Programming Languages [J] . Felgentreff Tim, Millstein Todd, Borning Alan, ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages . 2015,第10期

机译：在对象约束编程语言中检查和平衡约束求解而不会感到惊讶
3. Checks and Balances Constraint Solving without Surprises in Object-Constraint Programming Languages [J] . Felgentreff Tim, Millstein Todd, Borning Alan, ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages . 2015,第10期

机译：在对象约束编程语言中检查和平衡约束求解而不会感到惊讶
4. Optimally Indifferentiable Double-Block-Length Hashing Without Post-processing and with Support for Longer Key Than Single Block [C] . Yusuke Naito International Conference on Cryptology and Information Security in Latin America . 2019

机译：最佳不可区分的双块长度散列，无需后处理，并且支持比单个块更长的密钥
5. Talker discrimination in preschool children with and without specific language impairment. [D] . Dailey, Natalie S. 2013

机译：有或没有特定语言障碍的学龄前儿童的说话人歧视。
6. Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing [O] . Stephen Wu, Timothy Miller, James Masanz, -1

机译：否定的解决方案：临床自然语言处理中的可概括性与可优化性
7. Using multiset discrimination to solve language processing problems without hashing [O] . Cai Jiazhen, Paige Robert 1995

机译：使用多集区分来解决语言处理问题而无需哈希

USING MULTISET DISCRIMINATION TO SOLVE LANGUAGE PROCESSING PROBLEMS WITHOUT HASHING

摘要

著录项

相似文献

相关主题

期刊订阅