Machine learning based approach to analyze file meta data for smart phone file triage

Serhal Cezar; Le-Khac Nhien-An

Abstract

With the rapid increase in mobile phone storage capacity and penetration, digital forensic investigators face a significant challenge in quickly identifying relevant examinable files within a plethora of uninteresting OS and application files extracted by forensic tools. This challenge can have serious adverse effects in time-critical cases, and can also result in a growing case backlog. A possible solution for this issue is to prioritize digital artifacts. This is referred to as triage. Several digital forensic triage methodologies based on classical automation techniques such as block hash and regular expression matching have been proposed. However, such techniques suffer from the significant limitation of requiring users to know and hardcode data templates and relations of interest. In the literature, more flexible machine learning based approaches have been proposed to classify whether a mobile device, rather than a mobile device artifact, is of interest or not based on its usage metrics and file-system metadata. Also, recently an approach has been proposed and tested in triaging data generated and extracted from a computer-based operating system. However, this approach did not cover smart mobile operating systems, and it did not consider key steps such as feature engineering, feature selection, and hyper-parameter tuning. Hence, in this paper, we propose a comprehensive machine learning based solution with features extracted from file metadata to identify possible smart phone files of interest that should be examined. A range of classification algorithms are tested and their performance compared. Our classification models were trained and tested on a dataset consisting of the metadata of nearly 2 million files extracted from devices running Android OS and linked to real terrorism cases. The use of real case data yields realistic results, and restricting the operating system and case type helps narrow the experimentation scope enough to provide a proof of concept. Through our experiments, a best classifier is also identified. © 2021 The Authors. Published by Elsevier Ltd.
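
As a rough illustration of what "features extracted from file metadata" could look like, the following Python sketch turns one file's path, size, and modification time into a flat feature dictionary. It is not the authors' method: the abstract does not list the features they used, so the function name `extract_features`, the feature names, the extension groups, and the storage prefixes are all assumptions made for this example.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath

# Hypothetical extension groups chosen for illustration only;
# the abstract does not specify the paper's actual feature set.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".png", ".mp4", ".3gp", ".amr"}
DOCUMENT_EXTENSIONS = {".pdf", ".doc", ".docx", ".txt"}

# Hypothetical prefixes for user-accessible storage on an Android device.
USER_STORAGE_PREFIXES = ("/sdcard/", "/storage/")


def extract_features(path: str, size_bytes: int, modified_epoch: int) -> dict:
    """Turn one file's metadata into a flat feature dictionary."""
    p = PurePosixPath(path)
    ext = p.suffix.lower()
    modified = datetime.fromtimestamp(modified_epoch, tz=timezone.utc)
    # Count the directories above the file, ignoring the root "/".
    depth = len([part for part in p.parts[:-1] if part != "/"])
    return {
        "extension": ext,
        "is_media": ext in MEDIA_EXTENSIONS,
        "is_document": ext in DOCUMENT_EXTENSIONS,
        "path_depth": depth,
        "in_user_storage": path.startswith(USER_STORAGE_PREFIXES),
        "size_bytes": size_bytes,
        "modified_hour_utc": modified.hour,
        "modified_weekday": modified.weekday(),  # Monday is 0
    }
```

Dictionaries of this kind could then be encoded numerically and passed to a classifier such as a random forest or a neural network, the two model families named in the keywords.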


Bibliographic record

Source
Digital investigation | 2021, issue suppla | 301194.1-301194.9 | 9 pages
Authors
Serhal Cezar; Le-Khac Nhien-An
Author affiliations

Univ Coll Dublin, Dublin 4, Ireland

Univ Coll Dublin, Dublin 4, Ireland

Original format: PDF
Language: English
Keywords
Smartphone file triage; Mobile forensic triage; Metadata; Classification; Random forest; Neural networks; Mobile phone forensics
