Bridging Distinct Domains in Privacy Related Learning Problems

Abstract

The development of efficient and effective machine learning methods has prompted a surge of research on their application, ranging from spam filtering to recommender systems. Blindly applying machine learning tools to learning problems in privacy and security, however, often fails to produce the desired results. These problems differ from standard learning settings in that adversaries are ordinarily present and training data with reliable ground truth is frequently difficult to obtain. The problem is exacerbated by the fact that the data used to test methods may differ from the real-world data for which the model is created. This thesis addresses three learning problems in privacy and security, each of which involves data from different domains that must be considered.

In authorship attribution, we tackle the cross-domain case, in which the training data and testing data are written in different contexts and mediums. Research in this area has been limited to texts written in the same domain, an assumption that often cannot be made in real-world settings. We explore cross-domain attribution in three such domains: blogs, Twitter feeds, and Reddit comments.

Research in website fingerprinting focuses on a single domain, the incoming and outgoing packets on a network, to determine which webpage a user is visiting. In addition to this domain, we focus on the websites themselves and develop methods that successfully determine which website-level features cause a site to be more or less susceptible to this type of attack.

Similarly, most research on the economies and structure of cybercriminal forums focuses only on the domain of private messages. While some research has investigated what can be learned from the public interactions on these forums, no work has tried to bridge these domains. We present a method to predict which public threads are likely to trigger private interactions.

Bibliographic record

  • Author: Overdorf, Rebekah
  • Author affiliation: Drexel University
  • Degree-granting institution: Drexel University
  • Subject: Computer science; Artificial intelligence
  • Degree: Ph.D.
  • Year: 2017
  • Pages: 127
  • Original format: PDF
  • Language of text: English
