Anonymizing Temporal Phrases in Natural Language Text to be Posted on Social Networking Services

Abstract

Time-related information in text posted online is one type of personal information targeted by attackers, which is one reason that sharing information online can be risky. Time information should therefore be anonymized before it is posted on social networking services (SNSs). One approach to anonymization is to replace sensitive phrases with anonymous phrases, but attackers can usually spot such replacements because they read unnaturally. Another approach is to detect temporal passages in the text and remove them, but removal can make the meaning of the text unnatural. We have developed an algorithm that anonymizes time-related personal information by removing temporal passages when doing so will not change the natural meaning of the message. Temporal phrases are detected using machine-learned patterns, each represented by a subtree of the sentence's parse tree. The temporal phrases in the parse tree are distinguished from other parts of the tree by temporal taggers integrated into the algorithm. In an experiment with 4008 sentences posted on a social network, 84.53% were anonymized without changing their intended meaning. This is significantly better than the 72.88% rate of the best previous temporal phrase detection algorithm. Of the learned patterns, the ten most common ones detected 87.78% of the temporal phrases. This means that only some of the most common patterns are needed to anonymize the temporal phrases in most messages to be posted on an SNS. The algorithm works well not only for temporal phrases in text posted on social networks but also for other types of phrases (such as location and objective ones), other areas (religion, politics, military, etc.), and other languages.
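
The idea of matching patterns against subtrees of a parse tree and deleting the temporal part can be illustrated with a small sketch. This is not the authors' implementation: the tuple-based tree encoding, the `PATTERNS` set, and the helper names `label`, `prune`, and `words` are all assumptions made for illustration. The `-TMP` suffix on a node label stands in for the output of a temporal tagger, and the sketch does not attempt the check that removal preserves the natural meaning of the sentence.

```python
# Toy parse tree: internal nodes are tuples (label, child, ...); leaves are words.
# A "-TMP" suffix marks a subtree flagged by a temporal tagger (assumption).

# Hypothetical learned patterns: (parent label, temporal child label).
PATTERNS = {("VP", "PP-TMP"), ("VP", "NP-TMP"), ("S", "NP-TMP")}


def label(node):
    """Return the label of an internal node, or None for a leaf word."""
    return node[0] if isinstance(node, tuple) else None


def prune(node):
    """Return a copy of the tree without children that match a pattern."""
    if isinstance(node, str):
        return node
    parent = node[0]
    kept = [prune(child) for child in node[1:]
            if (parent, label(child)) not in PATTERNS]
    return (parent, *kept)


def words(node):
    """Collect the leaf words of a tree from left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]


sentence = ("S",
            ("NP", "I"),
            ("VP",
             ("VBD", "visited"),
             ("NP", "Tokyo"),
             ("PP-TMP", ("IN", "on"), ("NP", "Friday"))))

anonymized = " ".join(words(prune(sentence)))
print(anonymized)  # I visited Tokyo
```

Here the pattern `("VP", "PP-TMP")` matches the subtree for "on Friday", so that phrase is dropped and the rest of the sentence is left intact. A sentence with no tagged temporal subtree passes through unchanged.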

Bibliographic details

Source: International Workshop on Digital-Forensics and Watermarking, 2014, pp. 437-451 (15 pages)
Authors: Hoang-Quoc Nguyen-Son; Anh-Tu Hoang; Minh-Triet Tran; Hiroshi Yoshiura; Noboru Sonehara; Isao Echizen
Original format: PDF
Keywords: Anonymization; Temporal phrase deletion; Social networking service
