To address the need for user privacy protection in cloud storage services, we propose a method for computing document similarity over ciphertext. The data owner uploads the document ID, the encrypted document, and the encrypted simhash value of the document to the cloud server. The cloud service provider performs fully homomorphic addition on the simhash ciphertext of the query document and the simhash ciphertexts of the data owner's documents, yielding the ciphertext of the Hamming distance between documents. The data owner decrypts the Hamming distance ciphertexts to obtain the document similarity ranking. Because the cloud completes the similarity computation without learning either the document content or the plaintext simhash values, data privacy is preserved. We describe the method in detail and report experimental data that verify its feasibility and correctness.
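The arithmetic behind the pipeline can be illustrated in plaintext. The following is a minimal sketch and not the authors' implementation: it performs no encryption at all, and the tokenization (whitespace split), token hash (MD5), fingerprint width (64 bits), unit token weights, and all function names are assumptions made for illustration. It also assumes that the simhash is encrypted bit by bit, so that homomorphic addition acts as addition modulo 2 on each bit; the abstract does not state this. Under that assumption, `add_mod2` stands in for the operation the server would apply to ciphertext bits, and counting the ones in the result gives the Hamming distance that the data owner would recover after decryption.

```python
import hashlib

BITS = 64  # assumed fingerprint width


def simhash(text, bits=BITS):
    """Plain simhash over whitespace tokens, each token weighted 1 (assumption)."""
    counts = [0] * bits
    for token in text.split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            counts[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when its vote total is positive.
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def to_bits(value, bits=BITS):
    """Split an integer fingerprint into a list of bits, least significant first."""
    return [(value >> i) & 1 for i in range(bits)]


def add_mod2(a_bits, b_bits):
    """Bitwise addition mod 2 (XOR).

    Plaintext stand-in for the homomorphic addition the server would
    perform on encrypted bits; nothing is encrypted in this sketch.
    """
    return [(a + b) % 2 for a, b in zip(a_bits, b_bits)]


def hamming_distance(a, b, bits=BITS):
    """Number of bit positions where the two fingerprints differ."""
    return sum(add_mod2(to_bits(a, bits), to_bits(b, bits)))


def rank_by_similarity(query, docs):
    """Return (doc_id, distance) pairs, most similar (smallest distance) first."""
    q = simhash(query)
    scored = [(doc_id, hamming_distance(q, simhash(text)))
              for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1])
```

A smaller Hamming distance means more similar documents, so sorting the decrypted distances in ascending order gives the similarity ranking described above. In a real deployment the fingerprints would be encrypted before upload and `add_mod2` would be replaced by the addition operation of the chosen homomorphic encryption library.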