Conference: 2011 Eighth International Conference on Fuzzy Systems and Knowledge Discovery

Automatically locating salutation and signature blocks in emails


Abstract

This paper addresses the problem of automatically locating salutation and signature blocks in the body of plain-text emails. The salutation and signature blocks of an email usually contain identity information about its sender or recipients. Locating and extracting these blocks has many potential applications, such as entity attribute extraction, person-entity-based analysis of email social networks, anonymization of email corpora, and improvement of automatic content-based email classifiers and email threading. Our approach combines a statistical method with a rule-restricted method, which greatly improves locating efficiency while maintaining relatively high accuracy for the extracted blocks. We use the statistical method to roughly estimate the number of lines in the salutation and signature blocks, and then apply restriction rules to refine the lines located by the statistical method. Results on the public subset of the Enron corpus show that the approach performs well, with an average F1 value above 94%.
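
The two-stage idea in the abstract (a statistical estimate of block size, followed by rule-based refinement) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' method: the abstract does not say which statistic or which rules the paper uses, so the median-based estimate, the keyword patterns, and the function names `estimate_block_size` and `locate_blocks` are all assumptions.

```python
import re
from statistics import median

# Hypothetical rule patterns; the abstract does not list the actual rules.
SALUTATION_RE = re.compile(r"^\s*(hi|hello|dear|hey)\b", re.IGNORECASE)
SIGNOFF_RE = re.compile(
    r"^\s*(thanks|thank you|regards|best|sincerely|cheers)\b", re.IGNORECASE
)


def estimate_block_size(labeled_sizes):
    """Statistical step (assumed): median block length, in lines,
    observed in a set of labeled emails, truncated to an integer."""
    return int(median(labeled_sizes)) if labeled_sizes else 0


def locate_blocks(body, salutation_size, signature_size):
    """Return (salutation_line_indices, signature_line_indices)."""
    lines = body.splitlines()
    # Indices of non-blank lines; blank lines are never block candidates.
    content = [i for i, ln in enumerate(lines) if ln.strip()]

    # Statistical candidates: the first N non-blank lines for the salutation.
    sal_candidates = content[:salutation_size]
    # Rule (assumed): keep only candidates that look like a greeting.
    salutation = [i for i in sal_candidates if SALUTATION_RE.match(lines[i])]

    # Statistical candidates: the last M non-blank lines for the signature,
    # excluding anything already assigned to the salutation.
    start = max(len(content) - signature_size, 0)
    sig_candidates = [i for i in content[start:] if i not in salutation]
    # Rule (assumed): the signature starts at the first sign-off phrase
    # and runs to the end of the candidates; no sign-off means no block.
    signature = []
    for pos, i in enumerate(sig_candidates):
        if SIGNOFF_RE.match(lines[i]):
            signature = sig_candidates[pos:]
            break
    return salutation, signature


if __name__ == "__main__":
    email = (
        "Hi John,\n\nPlease review the attached report.\n"
        "Let me know what you think.\n\nThanks,\nMary Smith\nEnron Corp."
    )
    sal_n = estimate_block_size([1, 1, 2])  # toy labeled salutation sizes
    sig_n = estimate_block_size([3, 4, 3])  # toy labeled signature sizes
    print(locate_blocks(email, sal_n, sig_n))
```

The statistical step keeps the search cheap by limiting the rules to a few lines at the top and bottom of the message, which is one plausible reading of the efficiency claim in the abstract. The rules then discard candidate lines that do not look like a greeting or a sign-off.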
