APPARATUS AND METHOD FOR VALIDATING SELF-PROPAGATED UNETHICAL TEXT

Abstract

The present invention provides an apparatus and a method for verifying the validity of a text. The apparatus includes: a text acquisition unit that obtains a plurality of self-replicating texts, generated in a self-replicating manner from training texts that have been labeled in advance as ethical or unethical; a dictionary-based discrimination unit that receives a self-replicating text, searches it for words whose similarity to profanity registered in a previously obtained profanity dictionary is at or above a predetermined level, and thereby determines whether the self-replicating text is unethical; a learning model-based discrimination unit that receives the self-replicating text, vectorizes it word by word, and extracts sentence feature vectors from the vectorized words according to a previously learned pattern estimation method to determine whether the self-replicating text is unethical; an original text-based discrimination unit that searches for the training text most similar to the self-replicating text and determines whether the self-replicating text is unethical according to the label of the retrieved training text; and a discrimination result comparison unit that obtains the final discrimination result for the self-replicating text by combining the results determined by the dictionary-based, learning model-based, and original text-based discrimination units.
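The three-way discrimination described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the patented implementation: the function names are hypothetical, string similarity is measured with `difflib.SequenceMatcher` from the standard library, the learned model is passed in as an arbitrary callable, and the results are combined by majority vote, since the abstract only says the results are combined and does not say how.

```python
from difflib import SequenceMatcher
from typing import Callable, Sequence, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (an assumed metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def dictionary_vote(text: str, lexicon: Sequence[str],
                    threshold: float = 0.8) -> bool:
    """True if any word is at least `threshold`-similar to a lexicon entry."""
    return any(
        similarity(word, entry) >= threshold
        for word in text.split()
        for entry in lexicon
    )


def original_text_vote(text: str,
                       labeled_training: Sequence[Tuple[str, bool]]) -> bool:
    """Return the label of the most similar training text (True = unethical)."""
    _, label = max(labeled_training, key=lambda pair: similarity(text, pair[0]))
    return label


def final_decision(text: str,
                   lexicon: Sequence[str],
                   labeled_training: Sequence[Tuple[str, bool]],
                   model_vote: Callable[[str], bool]) -> bool:
    """Combine the three votes; majority voting is an assumption."""
    votes = [
        dictionary_vote(text, lexicon),
        model_vote(text),  # stand-in for the learned sentence classifier
        original_text_vote(text, labeled_training),
    ]
    return sum(votes) >= 2


# Hypothetical toy data, used only to exercise the functions.
lexicon = ["darn"]
training = [("have a nice day", False), ("you darn fool", True)]


def always_ethical(text: str) -> bool:
    """Placeholder model that never flags anything."""
    return False


flagged = final_decision("you darnn fool", lexicon, training, always_ethical)
clean = final_decision("have a nice day today", lexicon, training, always_ethical)
```

In this toy run the misspelled word is caught by the dictionary check and the nearest training text carries an unethical label, so two of three votes flag the first text even though the placeholder model does not. A real system would replace `always_ethical` with the trained classifier and might weight the three results differently.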
