A computational analysis of information structure using parallel expository texts in English and Japanese.

Abstract

This thesis concerns the notion of 'information structure': informally, organization of information in an utterance with respect to the context. Information structure has been recognized as a critical element in a number of computer applications: e.g., selection of contextually appropriate forms in machine translation and speech generation, and analysis of text readability in computer-assisted writing systems.

One of the problems involved in these applications is how to identify information structure in extended texts. This problem is often ignored, assumed to be trivial, or reduced to a sub-problem that does not correspond to the complexity of realistic texts. A handful of computational proposals face the problem directly, but they are generally limited in coverage and all suffer from lack of evaluation. To fully demonstrate the usefulness of information structure, it is essential to apply a theory of information structure to the identification problem and to provide an evaluation method.

This thesis adopts a classic theory of information structure as binomial partition between theme and rheme, and captures the property of theme as a requirement of the contextual-link status. The notion of 'contextual link' is further specified in terms of discourse status, domain-specific knowledge, and linguistic marking. The relation between theme and rheme is identified as the semantic composition of the two, and linked to surface syntactic structure using Combinatory Categorial Grammar. The identification process can then be specified as analysis of contextual-link status along the linguistic structure.

The implemented system identifies information structure in real texts in English. Building on the analysis of Japanese presented in the thesis, the system automatically predicts contextually-appropriate use of certain particles in the corresponding texts in Japanese. The machine prediction is then compared with human translations. The evaluation results demonstrate that the prediction of the theory is an improvement over alternative hypotheses. We then conclude that information structure can in fact be used to improve the quality of computational applications in practical settings.

Bibliographic record

  • Author: Komagata, Nobo Naonobu
  • Author affiliation: University of Pennsylvania
  • Degree-granting institution: University of Pennsylvania
  • Subject: Language Linguistics; Computer Science
  • Degree: Ph.D.
  • Year: 1999
  • Pages: 278 p.
  • Original format: PDF
  • Language of text: English
