Extracting Constraints on Word Usage from Large Text Corpora

Abstract

Our research focuses on the identification of word usage constraints from large text corpora. Such constraints are useful both for selecting vocabulary in language generation and for disambiguating lexical meaning in interpretation. We are developing both systems that can automatically extract such constraints from corpora and empirical methods for analyzing text. Identified constraints will be represented in a lexicon that will be tested computationally as part of a natural language system. We are also identifying lexical constraints for machine translation, using the aligned Hansard corpus as training data, and are identifying many-to-many word alignments.

Bibliographic record

  • Source
    Human language technology | 1994 | pp. 452-452 | 1 page
  • Conference location: Plainsboro, NJ (US)
  • Author affiliations

    Department of Computer Science 450 Computer Science Building Columbia University;

    Department of Computer Science 450 Computer Science Building Columbia University;

  • Conference organizer
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification: Computer software;
  • Keywords
