Journal: ACM Transactions on Asian and Low-Resource Language Information Processing

Plan Optimization to Bilingual Dictionary Induction for Low-resource Language Families



Abstract

Creating a bilingual dictionary is the first crucial step in enriching low-resource languages. For closely related languages in particular, the constraint-based approach has been shown to be useful for inducing bilingual lexicons from two bilingual dictionaries via a pivot language. However, if no machine-readable dictionaries are available as input, we need to consider manual creation by bilingual native speakers. When the goal is to comprehensively create multiple bilingual dictionaries, it is difficult to determine the execution order of the constraint-based approach that reduces the total cost, even if several machine-readable bilingual dictionaries already exist. Plan optimization is crucial in composing the order of bilingual dictionary creation, taking into account the methods and their costs. We formalize the plan optimization for creating bilingual dictionaries as a Markov Decision Process (MDP), with the goal of obtaining a more accurate estimate of the most feasible optimal plan with the least total cost before fully implementing the constraint-based bilingual lexicon induction. We model a prior beta distribution of bilingual lexicon induction precision with language similarity and polysemy of the topology as the alpha and beta parameters. This distribution is further used to model the cost function and the state transition probability. We estimated the cost of all investment plans as a baseline for evaluating the proposed MDP-based approach, with total cost as the evaluation metric. After utilizing the posterior beta distribution from the first batch of experiments to construct the prior beta distribution in the second batch of experiments, the results show a 61.5% cost reduction compared to the estimated all investment plans and a 39.4% cost reduction compared to the estimated MDP optimal plan. The MDP-based proposal outperformed the baseline on total cost.
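The role of the beta distribution described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and it is not a full MDP: it only shows a standard Beta-Binomial conjugate update (the posterior from one batch becoming the prior for the next) and a one-step expected-cost comparison between induction and manual creation. The class `BetaBelief`, the functions, the cost model, and all numeric values are hypothetical; in particular, how language similarity and polysemy map to alpha and beta is not specified in the abstract, so the parameters are simply passed in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief over induction precision (hypothetical helper)."""
    alpha: float
    beta: float

    def mean(self) -> float:
        # Expected precision under the Beta distribution.
        return self.alpha / (self.alpha + self.beta)

    def update(self, correct: int, incorrect: int) -> "BetaBelief":
        # Conjugate Beta-Binomial update: add observed successes and failures.
        return BetaBelief(self.alpha + correct, self.beta + incorrect)


def expected_induction_cost(belief: BetaBelief, n_pairs: int,
                            check_cost: float, create_cost: float) -> float:
    """Assumed cost model: every induced pair is checked by a human, and the
    expected fraction of wrong pairs (1 - precision) is re-created manually."""
    precision = belief.mean()
    return n_pairs * (check_cost + (1.0 - precision) * create_cost)


def manual_cost(n_pairs: int, create_cost: float) -> float:
    """Assumed cost of creating every pair manually."""
    return n_pairs * create_cost


def cheaper_action(belief: BetaBelief, n_pairs: int,
                   check_cost: float, create_cost: float) -> str:
    """One-step (greedy) choice between induction and manual creation."""
    induce = expected_induction_cost(belief, n_pairs, check_cost, create_cost)
    manual = manual_cost(n_pairs, create_cost)
    return "induce" if induce < manual else "manual"


# Illustrative values only; none of these numbers come from the paper.
prior = BetaBelief(alpha=2.0, beta=2.0)
first_choice = cheaper_action(prior, n_pairs=100, check_cost=3.0, create_cost=4.0)

# Posterior after a first batch (80 correct, 20 incorrect induced pairs)
# serves as the prior for the second batch.
posterior = prior.update(correct=80, incorrect=20)
second_choice = cheaper_action(posterior, n_pairs=100, check_cost=3.0, create_cost=4.0)
```

Under these made-up costs, the uninformed prior favors manual creation, while the sharper posterior favors induction, which is the kind of shift a batch-to-batch update is meant to capture. A full plan optimizer would additionally model states, transitions, and the ordering of dictionary creation, none of which is attempted here.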
