Cuisine: Classification Using Stylistic Feature Sets and/or Name-Based Feature Sets

Yaakov HaCohen-Kerner; Hananya Beck; Elchai Yehudai; Mordechay Rosenstein; Dror Mughaz


Abstract

Document classification presents challenges due to the large number of features, their dependencies, and the large number of training documents. In this research, we investigated the use of six stylistic feature sets (comprising 42 features) and/or six name-based feature sets (comprising 234 features) for various combinations of the following classification tasks: the ethnic groups of the authors, the periods of time when the documents were written, and the places where the documents were written. The investigated corpus contains Jewish Law articles written in Hebrew-Aramaic, which present interesting problems for classification. Our system CUISINE (Classification Using Stylistic feature sets and/or NamE-based feature sets) achieves accuracy between 90.71% and 98.99% for the seven classification experiments (ethnicity, time, place, ethnicity & time, ethnicity & place, time & place, ethnicity & time & place). For the first six tasks, the stylistic feature sets in general, and the quantitative feature set in particular, are enough for excellent classification results. In contrast, the name-based feature sets perform rather poorly on these tasks. However, for the most complex task (ethnicity & time & place), a hill-climbing model using all feature sets succeeds in significantly improving the classification results. Most of the stylistic features (34 of 42) are language-independent and domain-independent. These features might be useful to the community at large, at least for rather simple tasks.
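The abstract mentions a hill-climbing model over feature sets but does not describe it in detail. The following is a minimal sketch of one common form of hill climbing over subsets of feature sets, not the authors' implementation. The function name `hill_climb_feature_sets`, the feature-set names other than "quantitative", and the scoring function `toy_score` with its numbers are all hypothetical placeholders; in practice `score` would be something like the cross-validated accuracy of a classifier trained on the selected feature sets.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def hill_climb_feature_sets(
    candidates: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Greedy hill climbing over subsets of feature sets.

    Starts from the empty selection and repeatedly toggles (adds or
    removes) the single feature set that gives the best score, stopping
    at the first selection no single toggle can improve (a local optimum).
    """
    names = list(candidates)
    current: FrozenSet[str] = frozenset()
    best = score(current)
    while True:
        # Neighbours differ from the current selection by exactly one set.
        neighbours = [current ^ {name} for name in names]
        top = max(neighbours, key=score, default=current)
        top_score = score(top)
        if top_score <= best:
            return current, best
        current, best = top, top_score


def toy_score(selected: FrozenSet[str]) -> float:
    # Invented numbers for illustration only; NOT results from the paper.
    gains = {"quantitative": 30, "orthographic": 10, "names": 5, "noise": 1}
    penalty = 2 * len(selected)  # pretend each extra feature set adds noise
    return sum(gains[name] for name in selected) - penalty


selected, value = hill_climb_feature_sets(
    ["quantitative", "orthographic", "names", "noise"], toy_score
)
print(sorted(selected), value)  # ['names', 'orthographic', 'quantitative'] 39
```

Under this toy score the search adds the three helpful sets one at a time and then stops, because adding "noise" would lower the score. Hill climbing only guarantees a local optimum, so results can depend on the starting selection.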


Bibliographic details

Source
Journal of the American Society for Information Science and Technology, 2010, Issue 8, pp. 1644-1657 (14 pages)
Authors
Yaakov HaCohen-Kerner; Hananya Beck; Elchai Yehudai; Mordechay Rosenstein; Dror Mughaz
Author affiliations

Department of Computer Science, Jerusalem College of Technology (Machon Lev), 21 Havaad Haleumi Street, P.O.B. 16031, 91160 Jerusalem, Israel;

Department of Computer Science, Jerusalem College of Technology (Machon Lev), 21 Havaad Haleumi Street, P.O.B. 16031, 91160 Jerusalem, Israel;

Department of Computer Science, Jerusalem College of Technology (Machon Lev), 21 Havaad Haleumi Street, P.O.B. 16031, 91160 Jerusalem, Israel;

Department of Computer Science, Jerusalem College of Technology (Machon Lev), 21 Havaad Haleumi Street, P.O.B. 16031, 91160 Jerusalem, Israel;

Department of Computer Science, Bar-Ilan University, 52900 Ramat-Gan, Israel and Department of Computer Science, Jerusalem College of Technology (Machon Lev)

Original format: PDF
Language: English

