Exploration of a Text Collection and Identification of Topics by Clustering

Abstract

An application of cluster analysis to identify topics in a collection of poster abstracts from the 2006 Society for Neuroscience (SfN) Annual Meeting is presented. Topics were identified by selecting, from the abstracts belonging to each cluster, the terms with the highest scores under different ranking schemes. The ranking scheme based on log-entropy performed better in this task than the more classical TF-IDF schemes. The extracted topics were evaluated by assigning each cluster to one dominant category and comparing the result with previously defined thematic categories for which titles are available. The results show that repeated bisecting k-means performs better than standard k-means.

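The term-selection step described in the abstract can be illustrated with a minimal sketch. It assumes the common log-entropy formulation (local weight log(1 + tf), global weight 1 plus the normalized sum of p log p over documents) and ranks terms by their summed weight within each cluster; the record does not give the exact formulas or ranking rule used in the paper, so the function names `log_entropy_weights` and `top_terms_per_cluster`, the toy documents, and the cluster labels below are all hypothetical.

```python
import math
from collections import defaultdict


def log_entropy_weights(counts):
    """Weight raw term counts with a log-entropy scheme (assumed formulation).

    counts: list of dicts mapping term -> positive raw frequency, one per document.
    Returns a list of dicts mapping term -> weight, one per document.
    """
    n_docs = len(counts)
    # Global frequency of each term over the whole collection.
    global_freq = defaultdict(int)
    for doc in counts:
        for term, tf in doc.items():
            global_freq[term] += tf
    # Sum of p * log(p) per term, where p = tf / global frequency.
    entropy_sum = defaultdict(float)
    for doc in counts:
        for term, tf in doc.items():
            p = tf / global_freq[term]
            entropy_sum[term] += p * math.log(p)
    # Global weight is 1 for a term concentrated in one document and
    # approaches 0 for a term spread evenly over all documents.
    norm = math.log(n_docs) if n_docs > 1 else 1.0
    global_weight = {t: 1.0 + s / norm for t, s in entropy_sum.items()}
    return [
        {t: global_weight[t] * math.log(1 + tf) for t, tf in doc.items()}
        for doc in counts
    ]


def top_terms_per_cluster(weights, labels, k=3):
    """Return the k terms with the highest summed weight in each cluster."""
    totals = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(weights, labels):
        for term, w in doc.items():
            totals[label][term] += w
    return {
        label: [t for t, _ in sorted(tw.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for label, tw in totals.items()
    }


# Toy collection (hypothetical): four documents already assigned to two clusters.
docs = [
    {"neuron": 3, "cell": 1},
    {"neuron": 2, "synapse": 2, "cell": 1},
    {"memory": 3, "cell": 1},
    {"memory": 2, "learning": 2, "cell": 1},
]
labels = [0, 0, 1, 1]

weights = log_entropy_weights(docs)
print(top_terms_per_cluster(weights, labels, k=2))
```

In this toy input the term "cell" occurs once in every document, so its global weight is close to zero and it never surfaces as a topic term, which is the behavior that distinguishes an entropy-based global weight from raw frequency.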

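Bibliographic record

Source: International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2007); 16-19 December 2007; Birmingham (GB) | 2007 | pp. 115-124 | listed as 2 pages
Conference location: Birmingham (GB)
Authors: Antoine Naud; Shiro Usui
Author affiliation: RIKEN Brain Science Institute
Original format: PDF
Language: English
Chinese Library Classification: Artificial intelligence theory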

Similar literature

1. Exploration of a collection of documents in neuroscience and extraction of topics by clustering [J]. Naud A, Usui S. Neural Networks: The Official Journal of the International Neural Network Society. 2008, No. 8.
2. Research on Multiple Layer Text Topics Identification Algorithm Based on the Dynamic Diverse Thresholds Clustering [J]. Yong-Dong Xu, Ting-Bin Zhang, Guang-Ri Quan. Advanced Science Letters. 2012.
3. Additive Regularization for Topic Models of Text Collections [J]. K. V. Vorontsov. Doklady. Mathematics. 2014, No. 3.
4. Exploration of a Text Collection and Identification of Topics by Clustering [C]. Antoine Naud, Shiro Usui. International Conference on Intelligent Data Engineering and Automated Learning. 2007.
5. Opinion topic, holder and polarity in texts: Exploration and automatic identification from cross-lingual data [D]. Kim, Kyoung-Young. 2011.
6. Health-Related Hot Topic Detection in Online Communities Using Text Clustering [O]. Yingjie Lu, Pengzhu Zhang, Jingfang Liu. 2010.
7. Shallow Text Clustering Does Not Mean Weak Topics: How Topic Identification Can Leverage Bigram Features [O]. Velcin Julien, Roche Mathieu, Poncelet Pascal. 2016.
