PRIVACY PRESERVATION IN A QUERYABLE DATABASE BUILT FROM UNSTRUCTURED TEXTS

Abstract

A computer-implemented method of generating a queryable database (109). The method receives a corpus of free text documents (120) containing confidential data, the free text documents being related to the same domain. A trained Natural Language Processing (NLP) system (104) assigns one or more abstract named entities to each free text document in the corpus. The abstract named entities of each free text document are stored in a queryable database configured to provide aggregated information regarding the named entities. The NLP system is configured such that the abstract named entities are recognised and disambiguated with a precision between 0.75 and less than 1 and a recall between 0.75 and less than 1, and such that the ratio of precision and recall is between 0.7 and 1.3; wherein the queryable database is free from the addition of artificial noise by an artificial noise generation algorithm.
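
The numeric conditions in the abstract, and the idea of a database that stores only abstract named entities and answers aggregate queries, can be illustrated with a minimal Python sketch. Everything named here is hypothetical and is not taken from the patent: the function names, the `EntityDatabase` class, and the example entity labels are invented for illustration. The sketch also assumes that the lower bound of 0.75 and both ends of the 0.7 to 1.3 ratio range are inclusive, which the abstract does not state.

```python
from dataclasses import dataclass, field


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Standard precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


def within_stated_bounds(precision: float, recall: float) -> bool:
    """Check the three conditions from the abstract.

    Assumption: 0.75 is inclusive, 1 is exclusive, and the ratio
    range [0.7, 1.3] is inclusive at both ends.
    """
    return (
        0.75 <= precision < 1.0
        and 0.75 <= recall < 1.0
        and 0.7 <= precision / recall <= 1.3
    )


@dataclass
class EntityDatabase:
    """Hypothetical store holding only abstract named entities per document."""

    records: dict[str, set[str]] = field(default_factory=dict)

    def add(self, doc_id: str, entities: set[str]) -> None:
        # Only the entity labels are kept; the free text itself is never stored.
        self.records[doc_id] = set(entities)

    def count(self, entity: str) -> int:
        # Aggregate query: number of documents tagged with the given entity.
        return sum(entity in ents for ents in self.records.values())


# Example: an NLP system with 90 true positives, 10 false positives, 20 false negatives.
p, r = precision_recall(tp=90, fp=10, fn=20)
ok = within_stated_bounds(p, r)

db = EntityDatabase()
db.add("doc-1", {"DIAGNOSIS:asthma", "DRUG:salbutamol"})  # made-up labels
db.add("doc-2", {"DIAGNOSIS:asthma"})
asthma_docs = db.count("DIAGNOSIS:asthma")
```

In this example the precision is 0.9 and the recall is about 0.82, so all three conditions hold. A perfect system with precision and recall both equal to 1 would fail the check, since the abstract requires both values to be strictly below 1.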
