Concept Discovery from Text


Abstract

Broad-coverage lexical resources such as WordNet are extremely useful. However, they often include many rare senses while missing domain-specific senses. We present a clustering algorithm called CBC (Clustering By Committee) that automatically discovers concepts from text. It initially discovers a set of tight clusters called committees that are well scattered in the similarity space. The centroid of the members of a committee is used as the feature vector of the cluster. We proceed by assigning elements to their most similar cluster. Evaluating cluster quality has always been a difficult task. We present a new evaluation methodology that is based on the editing distance between output clusters and classes extracted from WordNet (the answer key). Our experiments show that CBC outperforms several well-known clustering algorithms in cluster quality.
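The two steps the abstract states concretely, using the centroid of a committee's members as the cluster's feature vector and assigning each element to its most similar cluster, can be illustrated with a minimal sketch. This is not the authors' implementation: committee discovery is not shown (the committees are simply given as input), cosine similarity is an assumed similarity measure, and the feature vectors, the names `centroid`, `cosine`, and `assign`, and the toy data are all hypothetical.

```python
import math


def centroid(vectors):
    """Average a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity (an assumed measure; the abstract names none)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign(features, committees):
    """Map each element to the committee whose centroid is most similar."""
    # The centroid of a committee's members serves as the cluster's feature vector.
    centroids = {
        name: centroid([features[member] for member in members])
        for name, members in committees.items()
    }
    return {
        element: max(centroids, key=lambda name: cosine(vector, centroids[name]))
        for element, vector in features.items()
    }


# Hypothetical toy data: three made-up context features per word.
features = {
    "apple": [1.0, 0.0, 0.1],
    "pear": [0.9, 0.1, 0.0],
    "car": [0.0, 1.0, 0.2],
    "truck": [0.1, 0.9, 0.1],
    "plum": [0.8, 0.2, 0.1],
}
# Committees are assumed to be already discovered; that step is omitted here.
committees = {"fruit": ["apple", "pear"], "vehicle": ["car", "truck"]}

clusters = assign(features, committees)
print(clusters["plum"])  # fruit
```

In this toy run, "plum" is not a committee member, yet it is assigned to the "fruit" cluster because its vector lies closest to that committee's centroid.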


Bibliographic details

Source
19th International Conference on Computational Linguistics (COLING 2002), Vol. 1, Aug 26-30, 2002, Taipei, Taiwan | 2002 | pp. 577-583 | 7 pages
Conference location: Taipei, Taiwan
Authors
Dekang Lin; Patrick Pantel
Author affiliation
Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada, T6G 2E8
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology

Similar literature

1. BioTextQuest: a web-based biomedical text mining suite for concept discovery [J]. Nikolas Papanikolaou, Evangelos Pafilis, Stavros Nikolaou. Bioinformatics. 2011, No. 23
2. BioTextQuest: a web-based biomedical text mining suite for concept discovery [J]. Vasilis J. Promponas. Bioinformatics. 2011, No. 23
3. Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy [J]. Tanja Bekhuis. Biomedical digital libraries. 2006, No. 1
4. Automatic Discovery of Concepts from Text [C]. Ong Siou Chin, Narayanan Kulathuramaiyer, Alvin W. Yeo. IEEE/WIC/ACM International Conference on Web Intelligence. 2006
5. Automatic concept organization: Organizing concepts from text through probability of co-occurrence analysis (POCA) [D]. Wu, Yi-Fang Brook. 2001
6. Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy [O]. Tanja Bekhuis. 2006
7. Concept Mining and Inner Relationship Discovery from Text [O]. Jiayu Zhou, Shi Wang. 2016
