International Conference on Fuzzy Systems and Knowledge Discovery
Automatically locating salutation and signature blocks in emails

Abstract

This paper addresses the problem of automatically locating salutation and signature blocks in the body of plain-text emails. The salutation and signature blocks of an email usually contain identity information about the email's sender or recipients. Locating and extracting these blocks has many potential applications, such as entity attribute extraction, person-entity-based email social network analysis, anonymization of email corpora, and improving automatic content-based email classifiers and email threading. Our approach combines a statistical method with a rule-restricted method, which greatly improves locating efficiency while maintaining relatively high accuracy for the extracted blocks. We use the statistical method to roughly estimate the number of lines in the salutation and signature blocks, and introduce restriction rules to refine the lines located by the statistical method. Results on the public subset of the Enron corpus show that our approach performs well, with an average F1 value above 94%.
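
The two-stage idea in the abstract (a coarse estimate of how many lines each block spans, followed by rule-based refinement) might look roughly like the Python sketch below. It is not the authors' implementation: the abstract does not give the statistical estimator or the rules, so the fixed window sizes `max_sal` and `max_sig` stand in for the statistical estimate, and every regular expression and function name here is a hypothetical choice made for illustration.

```python
import re

# Hypothetical rule patterns; the paper's actual rules are not stated in the abstract.
GREETING_RE = re.compile(r"^\s*(hi|hello|hey|dear)\b", re.IGNORECASE)
SIGNOFF_RE = re.compile(
    r"^\s*(thanks|thank you|regards|best regards|best|cheers|sincerely)[\s,!.]*$",
    re.IGNORECASE,
)
# Contact details (email address, phone-like number) or a "--" signature delimiter.
CONTACT_RE = re.compile(r"@|\d{3}[-.\s]\d{3,4}|^\s*--\s*$")


def is_salutation(line):
    """Rule: a greeting word, or a short line ending in ',' or ':' that is not a sign-off."""
    s = line.strip()
    if SIGNOFF_RE.match(s):
        return False
    short_address = s.endswith((",", ":")) and len(s.split()) <= 3
    return bool(GREETING_RE.match(s)) or short_address


def locate_blocks(body, max_sal=2, max_sig=5):
    """Return (salutation_line_indices, signature_line_indices) for an email body.

    max_sal and max_sig are assumed constants standing in for the
    statistically estimated block lengths described in the abstract.
    """
    lines = body.splitlines()
    content = [i for i, ln in enumerate(lines) if ln.strip()]  # skip blank lines

    # Stage 1: coarse candidates at the top and bottom of the body.
    sal_candidates = content[:max_sal]
    sig_candidates = content[-max_sig:] if max_sig else []

    # Stage 2a: keep leading candidate lines only while they pass the salutation rule.
    salutation = []
    for i in sal_candidates:
        if not is_salutation(lines[i]):
            break
        salutation.append(i)

    # Stage 2b: the signature starts at the first candidate that looks like a
    # sign-off or contact line and runs to the end of the body.
    signature = []
    for pos, i in enumerate(sig_candidates):
        if i in salutation:
            continue
        if SIGNOFF_RE.match(lines[i]) or CONTACT_RE.search(lines[i]):
            signature = [j for j in sig_candidates[pos:] if j not in salutation]
            break
    return salutation, signature


email_body = (
    "Hi John,\n\n"
    "The contract draft is attached. Please review it by Friday.\n"
    "Let me know if anything is unclear.\n\n"
    "Thanks,\nMary Smith\nLegal Dept.\n713-555-0100"
)
sal_lines, sig_lines = locate_blocks(email_body)
print(sal_lines, sig_lines)
```

Restricting the rules to a small window at each end of the body keeps the rule checks cheap, which is one plausible reading of the efficiency claim in the abstract; a real system would presumably derive the window sizes from corpus statistics rather than hard-code them.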