Journal: 電子情報通信学会技術研究報告. オフィスシステム

The document categorization by the vector space model using lexical different area Co-occurrence

Abstract

High-quality document classification is required even when document sets exhibit field polysemy. A pair of nouns that appear in the same sentence of a document and whose Euclidean distance is 0.15 to 0.35 or more is called a lexical different area co-occurrence, and we propose a document categorization method based on it. Using a newspaper, an industry magazine, and a paper as document sets, we show that (1) accuracy of about 90% was obtained on the newspaper, and (2) lexical different area co-occurrence increased the distance between fields by 15% compared with single nouns.
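
The definition above can be illustrated with a minimal sketch. The abstract does not say how nouns are represented as vectors or how the threshold within the 0.15 to 0.35 range is chosen, so the toy vectors, the function names (`euclidean`, `different_area_pairs`), and the default `threshold` below are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def different_area_pairs(sentence_nouns, noun_vectors, threshold=0.15):
    """Return noun pairs from one sentence whose vectors are at least
    `threshold` apart (hypothetical reading of the abstract's definition)."""
    pairs = []
    # Deduplicate and sort so the output order is deterministic.
    for a, b in combinations(sorted(set(sentence_nouns)), 2):
        if a in noun_vectors and b in noun_vectors:
            if euclidean(noun_vectors[a], noun_vectors[b]) >= threshold:
                pairs.append((a, b))
    return pairs


# Toy noun vectors (assumed; the abstract does not specify the vector space).
noun_vectors = {
    "bank": (0.5, 0.5),
    "loan": (0.6, 0.5),
    "river": (0.1, 0.9),
}

pairs = different_area_pairs(["bank", "loan", "river"], noun_vectors)
print(pairs)
```

In this toy data, "bank" and "loan" lie close together and are dropped, while the pairs involving "river" exceed the threshold and are kept. In a vector space model such pairs could then serve as features in place of single nouns.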
