Extracting Constraints on Word Usage from Large Text Corpora

Abstract

Our research focuses on the identification of word usage constraints from large text corpora. Such constraints are useful both for the problem of selecting vocabulary for language generation and for disambiguating lexical meaning in interpretation. We are developing systems that can automatically extract such constraints from corpora and empirical methods for analyzing text. Identified constraints will be represented in a lexicon that will be tested computationally as part of a natural language system. We are also identifying lexical constraints for machine translation using the aligned Hansard corpus as training data and are identifying many-to-many word alignments.

Publication details

Source
Human Language Technology | 1994 | p. 452 | 1 page
Conference location: Plainsboro, NJ (US)
Authors
Kathleen McKeown; Rebecca Passonneau
Author affiliation (both authors)

Department of Computer Science, 450 Computer Science Building, Columbia University

Original format: PDF
Language: English
Chinese Library Classification: Computer software

