Score Normalization Methods Applied to Topic Identification


Abstract

Multi-label classification plays a key role in modern categorization systems. Its goal is to find the set of labels belonging to each data item. Unlike multi-class classification, where only the best topic is chosen, multi-label document classification requires the classifier to decide, for each topic in the predefined topic set, whether or not a document belongs to it. We use a generative classifier to tackle this task, but the problem with this approach is that a threshold for positive classification must be set. This threshold can vary for each document depending on its content (words used, length of the document, ...). In this paper we use Unconstrained Cohort Normalization, originally proposed for the speaker identification/verification task, to robustly find the threshold defining the boundary between the correct and the incorrect topics of a document. In our earlier experiments we proposed a method for finding this threshold inspired by another normalization technique, World Model score normalization. A comparison of these normalization methods shows that Unconstrained Cohort Normalization achieves better results.
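To make the thresholding idea concrete, the following is a minimal Python sketch of cohort-style score normalization for a single document. The abstract does not state the formula, so the sketch assumes a common formulation of Unconstrained Cohort Normalization from speaker verification: each topic's log-likelihood is reduced by the mean log-likelihood of its highest-scoring competing topics, and a topic is accepted when the normalized score reaches a fixed threshold. The function names, the cohort size, the threshold of 0, and the log-likelihood values are all hypothetical and are not taken from the paper.

```python
def ucn_scores(log_likelihoods, cohort_size=2):
    """Normalize each topic score by the mean score of its top competitors.

    log_likelihoods maps topic name -> log p(document | topic), e.g. from a
    generative classifier. Requires at least two topics.
    """
    normalized = {}
    for topic, score in log_likelihoods.items():
        # Cohort (assumed definition): the best-scoring competing topics
        # for this particular document, excluding the topic itself.
        rivals = sorted(
            (s for t, s in log_likelihoods.items() if t != topic),
            reverse=True,
        )[:cohort_size]
        normalized[topic] = score - sum(rivals) / len(rivals)
    return normalized


def select_topics(log_likelihoods, threshold=0.0, cohort_size=2):
    """Return the topics whose normalized score reaches the threshold."""
    scores = ucn_scores(log_likelihoods, cohort_size)
    return sorted(t for t, s in scores.items() if s >= threshold)


# Hypothetical per-topic log-likelihoods for one document (made-up numbers).
doc = {"sports": -100.0, "politics": -102.0, "economy": -130.0, "culture": -140.0}
print(select_topics(doc))  # prints ['politics', 'sports']
```

Because the cohort is recomputed from each document's own scores, the same fixed threshold can be applied to documents of different lengths, whose raw log-likelihoods differ widely in scale. That is the motivation given in the abstract for normalizing before thresholding.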


Bibliographic record

Source: International Conference on Text, Speech and Dialogue | 2014 | 8 pages
Authors: Lucie Skorkovska; Zbynek Zajic
Original format: PDF
Chinese Library Classification: TP391.1-53
Keywords: Topic identification; Multi-label text classification; Naive Bayes classification; Score normalization


