Conference: ACM SIGKDD international conference on Knowledge discovery in data mining

Scalable discovery of hidden emails from large folders


Abstract

The popularity of email has prompted researchers to look for ways to help users better organize the enormous amount of information stored in their email folders. One challenge that has not been studied extensively in text mining is the identification and reconstruction of hidden emails. A hidden email is an original email that has been quoted in at least one email in a folder but is not itself present in that folder. It may have been (un)intentionally deleted or may never have been received. The discovery and reconstruction of hidden emails is critical for many applications, including email classification, summarization and forensics. This paper proposes a framework for reconstructing hidden emails using the embedded quotations found in messages further down the thread hierarchy. We evaluate the robustness and scalability of our framework using the Enron public email corpus. Our experiments show that hidden emails exist widely in that corpus and that our optimization techniques are effective in processing large email folders.
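To make the definition of a hidden email concrete, the following is a minimal Python sketch, not the framework proposed in the paper, whose details the abstract does not give. It assumes quoted text is marked with leading `>` characters and that a folder is simply a list of message bodies; the function names (`quoted_fragments`, `own_text`, `hidden_candidates`) are hypothetical. A quoted fragment is flagged as a hidden-email candidate when its text does not appear in the unquoted text of any message in the same folder.

```python
import re

# Assumption: quotations are marked by one or more leading '>' characters.
QUOTE_PREFIX = re.compile(r"^\s*(>\s?)+")


def normalize(text):
    """Lowercase and collapse whitespace so line wrapping does not matter."""
    return " ".join(text.lower().split())


def quoted_fragments(body):
    """Return contiguous runs of quoted lines with the '>' markers stripped."""
    fragments, current = [], []
    for line in body.splitlines():
        if QUOTE_PREFIX.match(line):
            current.append(QUOTE_PREFIX.sub("", line))
        elif current:
            fragments.append("\n".join(current))
            current = []
    if current:
        fragments.append("\n".join(current))
    return fragments


def own_text(body):
    """Return only the lines the sender wrote, without quoted lines."""
    return "\n".join(
        line for line in body.splitlines() if not QUOTE_PREFIX.match(line)
    )


def hidden_candidates(folder):
    """Return quoted fragments whose source is absent from the folder.

    `folder` is a list of message bodies (plain strings).
    """
    originals = [normalize(own_text(body)) for body in folder]
    seen, hidden = set(), []
    for body in folder:
        for fragment in quoted_fragments(body):
            key = normalize(fragment)
            if not key or key in seen:
                continue
            seen.add(key)
            # Quoted somewhere, but no message in the folder contains it.
            if not any(key in original for original in originals):
                hidden.append(fragment)
    return hidden


folder = [
    "Let's meet at 10am on Friday.",
    "> Let's meet at 10am on Friday.\nWorks for me.",
    "> Can you send the Q3 report?\nAttached.",
]
print(hidden_candidates(folder))
```

In this toy folder the first quotation matches a message that is present, while the second has no source in the folder and is reported as a candidate. The sketch compares every fragment against every message, so it illustrates the definition only and makes no attempt at the scalability the abstract emphasizes.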