A Corpus-based Approach for Keyword Identification using Supervised Learning Techniques


Abstract

This paper presents a corpus-based approach for extracting keywords from text written in a language that has no explicit word boundaries. Based on the concept of the Thai character cluster (TCC), a Thai running text is first segmented into a sequence of inseparable units, called TCCs. To handle large-scale text, a sorted sistring (suffix array) is used to compute several statistics for each TCC. Using these statistics, we apply three alternative supervised machine learning techniques (naive Bayes, centroid-based, and k-NN) to learn classifiers for keyword identification. The method is evaluated on medical text extracted from the Web. The results show that k-NN achieves the highest performance, with 79.5% accuracy.
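As an illustration of the suffix-array step only, the following minimal Python sketch shows how a sorted array of suffix positions lets one count occurrences of any unit sequence by binary search. It is not the authors' implementation: the function names `build_suffix_array` and `count_occurrences` are hypothetical, the input is assumed to be already segmented into a list of units (TCCs in the paper; any list of strings here), and the abstract does not say which statistics were actually computed, so plain frequency is used as a stand-in.

```python
def build_suffix_array(units):
    """Return the start positions of all suffixes of `units`, sorted lexicographically.

    Naive construction (sorting whole suffixes); adequate for a sketch,
    not for the large-scale text the abstract refers to.
    """
    return sorted(range(len(units)), key=lambda i: units[i:])


def _bound(units, suffix_array, pattern, upper):
    """Binary search for the first suffix whose prefix is >= pattern
    (upper=False) or > pattern (upper=True)."""
    lo, hi = 0, len(suffix_array)
    m = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        prefix = units[start:start + m]  # suffix truncated to the pattern length
        if prefix < pattern or (upper and prefix == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def count_occurrences(units, suffix_array, pattern):
    """Count how many times `pattern` (a sequence of units) occurs in `units`."""
    pattern = list(pattern)
    first = _bound(units, suffix_array, pattern, upper=False)
    last = _bound(units, suffix_array, pattern, upper=True)
    return last - first
```

For example, with `units = list("abracadabra")` (single characters standing in for TCCs) and `sa = build_suffix_array(units)`, calling `count_occurrences(units, sa, ["a", "b", "r", "a"])` counts the occurrences of that four-unit sequence. Frequencies obtained this way could then serve as input features for a classifier such as k-NN.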


Bibliographic details

Source
International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, 2008, 4 pages
Authors
Jakkrit TeCho; Cholwich Nattee; Thanaruk Theeramunkong
Original format: PDF
Chinese Library Classification: TP3-53

