Key word extraction for short text via word2vec, doc2vec, and textrank

Jun LI; Guimin HUANG; Chunli FAN; Zhenglin SUN; Hongtao ZHU

Abstract

Huge amounts of data are produced every day, and evaluating these data is becoming more difficult. The data obtained should yield meaningful, correct, and accurate information. To that end, the data must be separated into clusters correctly, and the right information must be extracted from those clusters. Obtaining correct clusters depends on the clustering algorithm used. Among the many families of clustering algorithms, density-based methods are particularly important because they can find clusters of arbitrary shape. This study proposes an advanced variant of the density-based spatial clustering of applications with noise (DBSCAN) algorithm, called fuzzy neighborhood DBSCAN Gaussian means (FN-DBSCAN-GM). The main contribution of FN-DBSCAN-GM is that it finds its parameters automatically and divides the data into clusters robustly. Its effectiveness is demonstrated on overlapping datasets (six artificial and two real-life datasets). Performance on these datasets is compared using the percentage of correct classification and a validity index. The experiments show that the new algorithm is a preferable and robust algorithm.


Bibliographic record

Source: Turkish Journal of Electrical Engineering and Computer Sciences, 2019, Issue 3, 12 pages
Authors: Jun LI; Guimin HUANG; Chunli FAN; Zhenglin SUN; Hongtao ZHU
Original format: PDF
Keywords: Cluster analysis; DBSCAN; FN-DBSCAN

Similar literature

1. Keyword extraction using supervised cumulative TextRank [J]. Monali Bordoloi, Preetam Chayan Chatterjee, Saroj Kumar Biswas. Multimedia Tools and Applications. 2020, Issue 41-42
2. Bidirectional Long Short Term Memory Method and Word2vec Extraction Approach for Hate Speech Detection [J]. Auliya Rahman Isnain, Agus Sihabuddin, Yohanes Suyanto. Indonesian Journal of Computing and Cybernetics Systems. 2020, Issue 2
3. Inside Importance Factors of Graph-Based Keyword Extraction on Chinese Short Text [J]. Chen Junjie, Hou Hongxu, Gao Jing. ACM transactions on Asian language information processing. 2020, Issue 5
4. Chinese Text Keyword Extraction Based on Doc2vec And TextRank [C]. Wei Wang, Xiangshun Li, Sheng Yu. Chinese Control and Decision Conference. 2020
5. Identifying the gist of conversational text: Automatic keyword extraction and summarization [D]. Liu, Fei. 2011
6. The Fractal Patterns of Words in a Text: A Method for Automatic Keyword Extraction [O]. Elham Najafi, Amir H. Darooneh
7. Key word extraction for short text via word2vec, doc2vec, and textrank [O]. Jun LI, Guimin HUANG, Chunli FAN. 2019
