Exploring Textual Features for Multi-label Classification of Portuguese Film Synopses

Abstract

The multi-label classification of film genres using features extracted from their synopses has recently gained some attention from the scientific community; however, the number of studies is still limited. These studies are even scarcer for languages other than English. In this work, we present the P-TMDb dataset, which contains 13,394 Portuguese film synopses, and explore film genre classification by experimenting with nine different groups of textual features and four multi-label algorithms. As our dataset is unbalanced, we also conducted experiments with an oversampled version of the dataset. The best result obtained for the original dataset was achieved by a TF-IDF based classifier, with an average F1 score of 0.478, while the best result for the oversampled dataset was achieved by a combination of several feature groups, with an average F1 score of 0.611.
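To make the reported metric concrete, the following is a minimal sketch of how an average F1 score can be computed for multi-label genre predictions. The abstract does not state which averaging scheme was used, so macro-averaging over genres is an assumption here, and the function names (`per_label_f1`, `macro_f1`) and the toy genre sets are hypothetical, not taken from the paper or from P-TMDb.

```python
from typing import List, Set


def per_label_f1(y_true: List[Set[str]], y_pred: List[Set[str]], label: str) -> float:
    """F1 for a single genre, treating it as a binary decision per film."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if label in t and label in p)
    fp = sum(1 for t, p in pairs if label not in t and label in p)
    fn = sum(1 for t, p in pairs if label in t and label not in p)
    denom = 2 * tp + fp + fn
    # Convention (an assumption): F1 is 0.0 when the genre never occurs
    # in either the gold labels or the predictions.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of the per-genre F1 scores (assumed averaging scheme)."""
    return sum(per_label_f1(y_true, y_pred, label) for label in labels) / len(labels)


# Toy example: each film may carry several genres at once.
gold = [{"Drama"}, {"Comedy", "Romance"}, {"Drama", "Romance"}, {"Comedy"}]
pred = [{"Drama"}, {"Comedy"}, {"Drama", "Comedy"}, {"Comedy"}]
genres = ["Comedy", "Drama", "Romance"]

score = macro_f1(gold, pred, genres)
print(f"macro F1 = {score:.3f}")
```

With macro-averaging, a rare genre that is never predicted (Romance in the toy data) pulls the average down as much as a frequent one would, which is one reason class imbalance and oversampling matter for this kind of score.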

Bibliographic details

Source
EPIA Conference on Artificial Intelligence | 2019 | xxxix, 785 p. | 13 pages
Authors
Giuseppe Portolese; Marcos Aurelio Domingues; Valeria Delisandra Feltrim
Original format: PDF
Chinese Library Classification: Artificial intelligence theory
Keywords
Multi-label classification; Film genre; Textual features; Natural Language Processing

Similar literature

1. FS-MLC: Feature selection for multi-label classification using clustering in feature space [J]. Nitin Kumar Mishra, Pramod Kumar Singh. Information Processing & Management, 2020, no. 4.
2. Recurrently exploring class-wise attention in a hybrid convolutional and bidirectional LSTM network for multi-label aerial image classification [J]. Hua Yuansheng, Mou Lichao, Zhu Xiao Xiang. ISPRS Journal of Photogrammetry and Remote Sensing, 2019, no. MARa.
3. Exploring an Ensemble of Textual Machine Learning Methodologies for Traffic Event Detection and Classification [J]. Konstantinos Kokkinos, Eftihia Nathanail. Transport and Telecommunication Journal, 2020, no. 4.
4. Exploring Textual Features for Multi-label Classification of Portuguese Film Synopses [C]. Giuseppe Portolese, Marcos Aurelio Domingues, Valeria Delisandra Feltrim. EPIA Conference on Artificial Intelligence, 2019.
5. XML document classification using structural and textual features [D]. Khabbazhaye Tajer, Mohammad. 2008.
6. Recurrently exploring class-wise attention in a hybrid convolutional and bidirectional LSTM network for multi-label aerial image classification [O]. Yuansheng Hua, Lichao Mou, Xiao Xiang Zhu.
7. New Multi-Label Correlation-Based Feature Selection Methods for Multi-Label Classification and Application in Bioinformatics [O]. Jungjit Suwimol. 2016.
