τ-SS3: A text classifier with dynamic n-grams for early risk detection over text streams

Burdisso Sergio G.; Errecalde Marcelo; Montes-y-Gomez Manuel


Abstract

A recently introduced classifier, SS3, has been shown to be well suited to early risk detection (ERD) problems on text streams. It obtained state-of-the-art performance on early depression and anorexia detection on Reddit in CLEF's eRisk open tasks. SS3 was designed to handle ERD problems naturally: it supports incremental training and classification over text streams, and it can visually explain its rationale. However, SS3 processes the input using a bag-of-words model and therefore cannot recognize important word sequences. This limitation could hurt classification performance and also reduces the descriptiveness of the visual explanations. In standard document classification, word n-grams are commonly used to overcome some of these limitations. Unfortunately, using n-grams over text streams is not trivial, since the system must learn and recognize which n-grams are important "on the fly". This paper introduces τ-SS3, an extension of SS3 that allows it to recognize useful patterns over text streams dynamically. We evaluated our model on the eRisk 2017 and 2018 tasks on early depression and anorexia detection. Experimental results suggest that τ-SS3 is able to improve both current results and the richness of visual explanations. © 2020 Elsevier B.V. All rights reserved.
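
As a rough illustration of what learning word n-grams "on the fly" can involve, the sketch below counts word sequences incrementally in a trie as chunks of a stream arrive. It is not the authors' implementation and does not reproduce how τ-SS3 decides which n-grams are important. The names `NgramTrie`, `update`, `count` and `max_n` are hypothetical, and n-grams that span two chunks are ignored for simplicity.

```python
class NgramTrie:
    """Hypothetical trie whose paths are word sequences, with one count per node."""

    def __init__(self, max_n=3):
        self.max_n = max_n  # longest n-gram stored (an assumption for this sketch)
        self.root = {"count": 0, "children": {}}

    def update(self, words):
        """Incrementally add every n-gram (n <= max_n) found in one chunk of words."""
        for start in range(len(words)):
            node = self.root
            for word in words[start:start + self.max_n]:
                node = node["children"].setdefault(
                    word, {"count": 0, "children": {}}
                )
                node["count"] += 1

    def count(self, ngram):
        """Return how many times the word sequence has been seen so far."""
        node = self.root
        for word in ngram:
            node = node["children"].get(word)
            if node is None:
                return 0
        return node["count"]


# Chunks arrive one at a time; nothing is recomputed from scratch.
trie = NgramTrie(max_n=3)
for chunk in ["i feel so alone", "i feel tired"]:
    trie.update(chunk.split())

print(trie.count(["i", "feel"]))        # 2
print(trie.count(["i", "feel", "so"]))  # 1
```

Because each update touches only the paths for the new words, the structure can grow while the stream is being read, which is the property the abstract asks of an n-gram model for text streams.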


Bibliographic details

Source
Pattern Recognition Letters | 2020, Issue 10 | pp. 130-137 | 8 pages
Authors
Burdisso Sergio G.; Errecalde Marcelo; Montes-y-Gomez Manuel
Affiliations

Univ Nacl San Luis UNSL Ejercito Los Andes 950 RA-5700 San Luis San Luis Argentina|Consejo Nacl Invest Cient & Tecn CONICET Buenos Aires DF Argentina;

Univ Nacl San Luis UNSL Ejercito Los Andes 950 RA-5700 San Luis San Luis Argentina;

Inst Nacl Astrofis Opt & Elect INAOE Luis Enrique Erro 1 Puebla 72840 Mexico;

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Early text classification; Dynamic word n-grams; Incremental classification; SS3; Explainability; Trie; Digital tree;


