Venue: IEEE International Conference on Information Visualisation

Visualization of long distance grammatical collocation patterns in language


Abstract

Research in generic unsupervised learning of language structure, applied to the Search for Extra-Terrestrial Intelligence (SETI) and to the decipherment of unknown languages, has sought to build up a generic picture of the lexical and structural patterns characteristic of natural language. As part of this toolkit, a generic system is required to facilitate the analysis of behavioral trends among selected pairs of terminals and non-terminals alike, regardless of which target natural language is selected. Such a tool may be useful in other areas, such as lexico-grammatical analysis or tagging of corpora. Data-oriented approaches to corpus annotation use statistical n-grams and/or constraint-based models; n-grams or constraints with wider windows can improve error rates by examining the topology of the annotation-combination space. We present a visualization tool to help linguists find "useful" PoS-tag combinations and cohesion between linguistic annotations at other levels, and we suggest some possible applications.
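The idea of looking at PoS-tag pairs across wider windows can be illustrated with a minimal sketch. It is not the authors' tool: the function names `tag_pair_counts` and `gap_profile`, the `max_gap` parameter, and the toy tag sequence are all assumptions made for illustration. The sketch counts how often one tag is followed by another at each distance, which is the kind of raw data a collocation visualization might plot.

```python
from collections import Counter


def tag_pair_counts(tags, max_gap):
    """Count ordered (left_tag, right_tag, gap) triples for gaps 1..max_gap.

    Hypothetical helper: `gap` is the distance in tokens between the two tags.
    """
    counts = Counter()
    for i, left in enumerate(tags):
        for gap in range(1, max_gap + 1):
            j = i + gap
            if j >= len(tags):
                break  # no further tokens to the right of position i
            counts[(left, tags[j], gap)] += 1
    return counts


def gap_profile(counts, left, right, max_gap):
    """Return the co-occurrence counts of one tag pair at gaps 1..max_gap."""
    return [counts[(left, right, gap)] for gap in range(1, max_gap + 1)]


# Toy PoS-tag sequence, invented for this example (not taken from any corpus).
tags = ["DET", "ADJ", "NOUN", "VERB", "DET", "NOUN", "PREP", "DET", "ADJ", "NOUN"]

counts = tag_pair_counts(tags, max_gap=3)
profile = gap_profile(counts, "DET", "NOUN", max_gap=3)
print(profile)  # [1, 2, 0]
```

In this toy sequence, DET is followed by NOUN once at distance 1 and twice at distance 2, so a purely adjacent (bigram) view would miss most of the association. Plotting such profiles for many tag pairs is one plausible way to spot long-distance patterns; the actual system described in the abstract may work differently.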
