IEEE International Conference on Machine Learning and Applications

Does Semantic Search Performs Better than Lexical Search in the Task of Assisting Legal Opinion Writing?

Abstract

Many of the criminal cases analysed by the Prosecution Office of the Federal District and Territories are repetitive, and processing them can be streamlined by providing similar previous cases as templates. We investigate the use of information retrieval techniques to automatically identify similar cases, and evaluate whether semantic search performs better than lexical search in the task of assisting legal opinion writing. As a proof of concept, syntactic indexing techniques (TF-IDF and BM25) and semantic indexing techniques (Latent Semantic Indexing, LSI, and Latent Dirichlet Allocation, LDA) were evaluated using document collections from two public prosecutors' offices. We also evaluate model enrichment using recorded data about the cases and the legal norm citations observed in the documents. Baseline collections sampled from the full document collections of the two offices were used for model evaluation, with Normalized Discounted Cumulated Gain (NDCG) as the metric. We conclude that there is no significant performance difference between semantic and syntactic indexing techniques, and we observe no significant performance gain from model enrichment. We chose BM25 as the most suitable technique because it offers a good balance between performance and simplicity.
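To make the two central ingredients of the abstract concrete, the sketch below ranks a few documents with BM25 and scores a ranking with NDCG. It is a minimal illustration, not the authors' implementation. The abstract does not describe their tokenization, parameters, or relevance labels, so the whitespace tokenization, the defaults `k1=1.5` and `b=0.75`, the non-negative IDF variant, and the toy English documents are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25).

    k1 and b are common default values, assumed here for illustration.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # IDF variant with "1 +" inside the log, so it never goes negative.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def ndcg(relevances, k=None):
    """NDCG for graded relevance labels listed in ranked order."""
    k = len(relevances) if k is None else k

    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical toy collection standing in for previous cases.
docs = [
    "theft of vehicle with aggravating circumstance".split(),
    "drug trafficking near school".split(),
    "theft of mobile phone in public transport".split(),
]
query = "vehicle theft".split()

scores = bm25_scores(query, docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Here `ranking` lists document indices from most to least similar to the query. Given graded relevance judgments for the retrieved documents in ranked order, `ndcg` returns a value between 0 and 1, where 1 means the ranking matches the ideal ordering.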