Conference: Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP 2012

Robust Induction of Parts-of-Speech in Child-Directed Language by Co-Clustering of Words and Contexts

Abstract

We introduce Conflict-Driven Co-Clustering, a novel algorithm for data co-clustering, and apply it to the problem of inducing parts-of-speech in a corpus of child-directed spoken English. Co-clustering is preferable to unidimensional clustering as it takes into account both item and context ambiguity. We show that the categorization performance of the algorithm is comparable with the co-clustering algorithm of Leibbrandt and Powers (2008), but out-performs that algorithm in robustly pruning less-useful clusters and merging them into categories strongly corresponding to the three main open classes of English.