Near-duplicate detection for eRulemaking


Abstract

U.S. regulatory agencies are required to solicit, consider, and respond to public comments before issuing regulations. In recent years, agencies have begun to accept comments via both email and Web forms. The transition from paper to electronic comments makes it much easier for individuals to customize "form" letters, which they do, creating "near-duplicate" comments that express the same viewpoint in slightly different language. This paper explores the use of simple text clustering and retrieval algorithms for identifying near-duplicate public comments. Experiments with public comments about a recent regulation proposed by the Environmental Protection Agency (EPA) demonstrate the effectiveness of the algorithms.
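
The abstract does not say which clustering or retrieval algorithms the paper uses, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes word-level shingles, Jaccard similarity, and a greedy single-pass clustering; the function names, the shingle size `k`, the `threshold` value, and the sample comments are all hypothetical choices made for this example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Short texts become a single shingle (or none if the text is empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_near_duplicates(comments, threshold=0.5, k=3):
    """Greedy single-pass clustering (an assumed, simplified scheme).

    Each comment joins the first cluster whose seed comment is at least
    `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of comment indices.
    """
    clusters = []  # list of (seed_shingles, member_indices)
    for idx, text in enumerate(comments):
        sh = shingles(text, k)
        for seed, members in clusters:
            if jaccard(sh, seed) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append((sh, [idx]))
    return [members for _, members in clusters]


# Invented sample comments: a form letter, a lightly customized copy,
# and an unrelated comment.
comments = [
    "Please strengthen the proposed rule to protect public health and the environment.",
    "Please strengthen the proposed rule to protect public health and the environment. Thank you.",
    "I support the rule as written and ask that it be finalized without changes.",
]
groups = cluster_near_duplicates(comments)
```

With these sample comments, the form letter and its customized copy share most of their shingles and land in the same group, while the unrelated comment starts its own. A real system would likely need a faster candidate-retrieval step than comparing every comment against every cluster seed.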
