Text coherence new method using word2vec sentence vectors and most likely n-grams

Abstract

Evaluating discourse coherence modeling remains a challenging task across Natural Language Processing subfields. Most proposed approaches focus on feature engineering, adopting sophisticated features to capture the logical, syntactic, or semantic relationships between the sentences of a text. This paper investigates the automatic evaluation of text coherence. We introduce a fully automatic, rich statistical model of local and global coherence that uses the word2vec approach to assess the coherence of a document. Our modeling approach relies on numerical vectors derived from the word2vec algorithm applied to a very large collection of texts. We combine the word2vec vectors and most likely n-grams with cohesive LD-n-grams perplexity to assess the coherence and topic integrity of a document. We present experimental results that assess the model's predictive power and indicate that it does not depend on the language or its semantic concepts, so it can be applied to any language. Our model achieves state-of-the-art performance on coherence evaluation and the order discrimination task on two datasets widely used by previous methods.
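The abstract does not spell out how sentence vectors are turned into a coherence score, so the following is only a minimal sketch of one common reading, not the authors' model. It assumes a sentence vector is the average of its word vectors and that local coherence is the mean cosine similarity between adjacent sentences. The tiny `TOY_VECTORS` table and the function names are hypothetical stand-ins for embeddings trained with word2vec on a large corpus, and the n-gram perplexity component mentioned in the abstract is not modeled here.

```python
import math

# Hypothetical 3-dimensional "word vectors" for illustration only; a real
# system would load embeddings trained with word2vec on a large corpus.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "pet": [0.7, 0.3, 0.1],
    "stock": [0.0, 0.1, 0.9],
    "market": [0.1, 0.0, 0.8],
}
DIM = 3


def sentence_vector(sentence):
    """Average the vectors of known words; unknown words are skipped."""
    vecs = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def local_coherence(sentences):
    """Mean cosine similarity between each pair of adjacent sentences."""
    vecs = [sentence_vector(s) for s in sentences]
    sims = [cosine(a, b) for a, b in zip(vecs, vecs[1:])]
    return sum(sims) / len(sims) if sims else 0.0


# Order discrimination in miniature: the same sentences, original vs. permuted.
original = ["cat pet", "dog pet", "stock market", "market stock"]
shuffled = ["cat pet", "stock market", "dog pet", "market stock"]
print(local_coherence(original) > local_coherence(shuffled))
```

In this toy setting the original ordering keeps topically related sentences adjacent and therefore scores higher than the permuted one, which is the intuition behind the order discrimination task the abstract refers to.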

Bibliographic details

Source
International Conference of Signal Processing and Intelligent Systems | 2017 | 198p | 5 pages
Authors
Mohamad Abdolahi Kharazmi; Morteza Zahedi Kharazmi;
Original format: PDF
Chinese Library Classification: TN911.7-53
Keywords
Coherence; Hidden Markov models; Semantics; Computational modeling; Task analysis; Matrix converters; Text processing;

Similar literature

1. Text Document Categorization using Enhanced Sentence Vector Space Model and Bi-Gram Text Representation Model Based on Novel Fusion Techniques [J]. Abdisa Demissie Amensisa. New Media and Mass Communication. 2020, No. 4
2. CNN-based text multi-classifier using filters initialised by N-gram vector [J]. Yan Xiang, Ying Xu, Zhengtao Yu. International Journal of Information and Communication Technology. 2019, No. 4
3. N-gram support vector machines for scalable procedure and diagnosis classification, with applications to clinical free text data from the intensive care unit [J]. Marafino B.J., Davies J.M., Bardach N.S. Journal of the American Medical Informatics Association. 2014, No. 5
4. Text coherence new method using word2vec sentence vectors and most likely n-grams [C]. Mohamad Abdolahi Kharazmi, Morteza Zahedi Kharazmi. 2017 3rd Iranian Conference on Intelligent Systems and Signal Processing. 2017
5. Methods of Sentence Extraction, Abstraction and Ordering for Automatic Text Summarization [D]. Nayeem, Mir Tafseer. 2017
6. Research and applications: N-gram support vector machines for scalable procedure and diagnosis classification with applications to clinical free text data from the intensive care unit [O]. Ben J Marafino, Jason M Davies, Naomi S Bardach. 2014
7. Review of Text Reduction Algorithms and Text Reduction using Sentence Vectorization [O]. Sneh Garg, C-dac Mohali, Sunil Chhillar. 2015
