Journal of Systemics, Cybernetics and Informatics

Unsupervised Topic Labeling of Text Based on Wikipedia Categorization



Abstract

Defining text topicality is often an expensive problem that requires significant resources for text labeling. Though many packages already exist that provide dictionaries of labeled text, synonyms, and part-of-speech tagging, the problem is ongoing as language develops and new meanings of words and phrases emerge. This paper proposes a solution, cheap in human labor, to topic labeling of any text in the majority of languages. The methodology uses links to the naturally emerging corpus of labeled text, Wikipedia. Wikipedia categories are processed to extract a weighted set of topic labels for the analyzed text. The approach is evaluated by processing categorized texts and comparing the similarity of the top-ranked topic labels to the text category. The topic labels extracted using this methodology can be used for comparing the similarity of texts, for assessing the completeness of topic coverage in automated marking of essays, and for coding in qualitative text analysis. The paper contributes to the field of NLP by offering a cheap and organically developing method of topical text labeling. It contributes to the work of qualitative analysts by offering a methodology for the analysis of interview transcripts and other unstructured text.
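The abstract does not spell out how the Wikipedia categories are processed into weights, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each term in the text has already been linked to a Wikipedia article and that the article's categories are available in a plain dictionary. The mapping `TERM_CATEGORIES`, the function `weighted_topic_labels`, and the frequency-based weighting are all hypothetical illustrations, not the paper's actual procedure.

```python
from collections import Counter

# Hypothetical stand-in for a Wikipedia lookup: term -> categories of the
# article that the term links to. A real pipeline would query Wikipedia.
TERM_CATEGORIES = {
    "neuron": ["Neuroscience", "Cells"],
    "synapse": ["Neuroscience", "Signal transduction"],
    "axon": ["Neuroscience", "Neuroanatomy"],
}

def weighted_topic_labels(text, term_categories):
    """Return (category, weight) pairs sorted by descending weight.

    Assumed weighting: each category's share of all category hits,
    so the weights sum to 1 whenever at least one term matches.
    """
    counts = Counter()
    for token in text.lower().split():
        token = token.strip(".,;:!?()\"'")  # drop surrounding punctuation
        for category in term_categories.get(token, []):
            counts[category] += 1
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by weight (descending), then by name for a stable order.
    return sorted(
        ((category, n / total) for category, n in counts.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )

labels = weighted_topic_labels("The axon of a neuron ends at a synapse.", TERM_CATEGORIES)
```

In this toy example the top-ranked label is the category shared by all three matched terms. Comparing such top-ranked labels against a known text category is one way the evaluation described in the abstract could be approximated.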
