

A Classification Method for Chinese Word Semantic Relations Based on TF-IDF and CNN



Abstract

The classification of semantic relations between words is an important part of semantic analysis in natural language research. Automating this classification matters for the construction of knowledge graphs and for information retrieval. In the NLPCC2017 shared task on Chinese Word Semantic Relations Classification, semantic relations are classified into four categories: synonym, antonym, hyponym and meronym. This paper presents a classification method for Chinese word semantic relations based on TF-IDF and CNN that uses both literal and semantic features of words. Four new literal features are proposed, including whether one word is part of the other and the ratio of their common substring. Semantic features are extracted in four steps: training a word vector model on the BaiduBaike corpus, selecting from BaiduBaike the set of words most related to a given word based on TF-IDF, constructing a vector matrix for that set of related words, and using a CNN to obtain the semantic features of the given word from the vector matrix. Experiments on the NLPCC2017 dataset show an F1-score of up to 83.91%, which indicates that the method is effective at reducing the influence of out-of-vocabulary (OOV) words.
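The two literal features named in the abstract, and the TF-IDF selection step, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not define the common-substring ratio, so the sketch assumes it is the longest common contiguous substring divided by the longer word's length; it also assumes related words are ranked among the tokens of the word's own encyclopedia entry, with a smoothed IDF. The function names `literal_features` and `top_related` are hypothetical.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def literal_features(w1: str, w2: str) -> dict:
    """Two literal features for a word pair (definitions assumed, see lead-in)."""
    # Longest common contiguous substring of the two words.
    match = SequenceMatcher(None, w1, w2, autojunk=False).find_longest_match(
        0, len(w1), 0, len(w2)
    )
    return {
        # Whether one word is contained in the other.
        "is_part": w1 in w2 or w2 in w1,
        # Assumed ratio: common substring length over the longer word's length.
        "common_ratio": match.size / max(len(w1), len(w2)),
    }


def top_related(entry_tokens: list, corpus: list, k: int = 3) -> list:
    """Return the k tokens of one entry with the highest TF-IDF scores."""
    n_docs = len(corpus)
    # Document frequency: in how many corpus documents each token occurs.
    df = Counter(tok for doc in corpus for tok in set(doc))
    tf = Counter(entry_tokens)
    scores = {
        # Smoothed IDF (assumption) so that unseen tokens do not divide by zero.
        tok: (count / len(entry_tokens))
        * (math.log((1 + n_docs) / (1 + df[tok])) + 1)
        for tok, count in tf.items()
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    print(literal_features("电脑", "笔记本电脑"))
    toy_corpus = [["a", "b", "b"], ["a", "c"], ["a", "d"]]
    print(top_related(toy_corpus[0], toy_corpus, k=2))
```

In the pipeline the abstract describes, the selected words would then be looked up in the trained word vector model and stacked into the matrix that the CNN consumes; that step is omitted here because the abstract gives no architectural details.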


Bibliographic details

Source: Chinese Lexical Semantics Workshop, 2018, pp. 509-518 (10 pages)
Authors: Teng Mao; Yuanyuan Peng; Yuru Jiang; Yangsen Zhang
Original format: PDF
Keywords: Semantic relations; CNN; TF-IDF

