Conference: IEEE International Conference on Machine Learning and Applications

Does Semantic Search Performs Better than Lexical Search in the Task of Assisting Legal Opinion Writing?


Abstract

Many of the criminal cases analysed by the Prosecution Office of the Federal District and Territories are repetitive, and processing them can be streamlined by providing similar previous cases as templates. We investigate the use of information retrieval techniques to automatically identify similar cases, and evaluate whether semantic search performs better than lexical search in the task of assisting legal opinion writing. As a proof of concept, syntactic (lexical) indexing techniques (TF-IDF and BM25) and semantic indexing techniques (Latent Semantic Indexing, LSI, and Latent Dirichlet Allocation, LDA) were evaluated using document collections from two public prosecutors' offices. In addition, we evaluate model enrichment using recorded data about the cases and the legal norm citations observed in the documents. Baseline document collections sampled from the full document collections of the two offices were used for model evaluation, with Normalized Discounted Cumulated Gain (NDCG) as the metric. We conclude that there is no significant performance difference between semantic and syntactic indexing techniques. In addition, we observe no significant performance gain from model enrichment. We chose BM25 as the more adequate technique because it offers a good balance between performance and simplicity.
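
As a rough illustration of the lexical side of this comparison, the sketch below scores a toy corpus with Okapi BM25 and evaluates the resulting ranking with NDCG. It is not the authors' implementation. The parameter values (k1 = 1.5, b = 0.75), the whitespace tokenization, the toy documents, the relevance labels, and the function names `bm25_scores` and `ndcg` are all assumptions made for this example. The NDCG variant shown (linear gain, log2 discount) is one common formulation; the abstract does not say which one was used.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    k1 and b are commonly used defaults, assumed here for illustration.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative for very frequent terms.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def ndcg(relevances, k=None):
    """NDCG for graded relevance labels listed in ranked order."""
    rels = relevances[:k] if k else relevances
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    ideal = sorted(relevances, reverse=True)[:len(rels)]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical toy corpus standing in for case documents.
docs = [
    "theft of vehicle in public parking".split(),
    "drug trafficking near school".split(),
    "vehicle theft with violence".split(),
]
query = "vehicle theft".split()

scores = bm25_scores(query, docs)
# Document indices ordered from highest to lowest BM25 score.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

# Hypothetical graded relevance judgments per document index.
judged = {0: 1, 1: 0, 2: 2}
ranked_rels = [judged[i] for i in ranking]
quality = ndcg(ranked_rels)
print(ranking, round(quality, 3))
```

In this toy setup the two documents that mention both query terms outrank the unrelated one, and the shorter of the two scores higher because of BM25's length normalization. A semantic model such as LSI or LDA would replace `bm25_scores` with a similarity computed in a learned topic space, while the NDCG evaluation step would stay the same.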
