Naive Bayes and similarity based methods for identifying computer users using keystroke patterns.

Abstract

In this dissertation, we present two methods for identifying computer users from their keystroke patterns.

In the first method, "Competition between naive Bayes models for user identification," a naive Bayes model is created for each user. In the training phase, a user's model is trained by maximum likelihood estimation on the key press latency values extracted from the texts typed by that user. In the identification phase, we compute, for each user, the likelihood that the typed text belongs to that user. The typed text is then assigned to the user with the highest likelihood value.

In the second method, "Similarity based user identification," each user is represented by a distinct model. In the training phase, a user's model parameters are estimated from the key press latency values extracted from the texts typed by that user. In the identification phase, we assign each user a similarity score for a given typed text. A user's similarity score is the ratio between (1) the number of key press latency values extracted from the typed text that are similar to the user's estimated model parameters and (2) the total number of key press latency values extracted from the typed text. The typed text is then assigned to the user with the highest similarity score.

We also present a novel application of a distance-based outlier detection method for discarding outliers among the key press latency values extracted from a user's typed text. Outliers are detected using the following three-step procedure: (1) for each extracted latency value x_i, a neighborhood region is created using a distance threshold; (2) a latency value x_j is considered a neighbor of x_i if x_j falls within the neighborhood region of x_i; and (3) the latency value x_i is considered an outlier if the number of neighbors found for x_i is less than a preset threshold.

To evaluate the proposed methods empirically, a keystroke data set was collected from ten users, each of whom provided 15 typing samples. From these typing samples, six distinct datasets were created in which the number of user identification attempts varied from 150 to 54,600. Results on these datasets indicate that the identification accuracy of the "Competition between naive Bayes models for user identification" method ranges from 89.62% to 99.65%, and that of the "Similarity based user identification" method ranges from 96.33% to 100%. Further, the performance of the two proposed methods is compared with that of two user identification methods reported in the recent literature.

To further improve the performance of the user identification methods, we theoretically analyze Majority Voting Rule (MVR) based fusion of two or more user identification methods. We formulate a procedure for theoretically estimating the identification accuracy of MVR based fusion. Unlike the procedure presented in the MVR fusion literature, our procedure does not assume that the methods to be fused have identical identification accuracy. The theoretically estimated identification accuracy of the MVR based fusion is then analyzed in light of the empirical results.
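
The first method can be illustrated with a minimal Python sketch. The abstract only states that each user's model is trained by maximum likelihood estimation on key press latencies, so several details here are assumptions made for illustration: each latency feature (for example, a key pair such as "th") is modeled as a Gaussian, features unseen in training are skipped, and the names `train_user_model`, `log_likelihood`, `identify`, and `MIN_STD` are hypothetical. This is not the dissertation's implementation.

```python
import math
from collections import defaultdict

# ASSUMPTION: a small floor on the standard deviation, so a feature observed
# with zero variance does not cause a division by zero.
MIN_STD = 1e-3


def train_user_model(samples):
    """Fit one user's model from (feature, latency_ms) pairs.

    ASSUMPTION: each feature is modeled as a Gaussian. The returned dict maps
    feature -> (mean, std), using the maximum likelihood estimates (1/n variance).
    """
    grouped = defaultdict(list)
    for feature, latency in samples:
        grouped[feature].append(latency)
    model = {}
    for feature, values in grouped.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        model[feature] = (mean, max(math.sqrt(var), MIN_STD))
    return model


def log_likelihood(model, typed):
    """Sum per-latency Gaussian log densities (the naive independence assumption).

    ASSUMPTION: features that the user never typed during training are skipped.
    """
    total = 0.0
    for feature, latency in typed:
        if feature not in model:
            continue
        mean, std = model[feature]
        z = (latency - mean) / std
        total += -0.5 * z * z - math.log(std) - 0.5 * math.log(2 * math.pi)
    return total


def identify(models, typed):
    """Assign the typed text to the user whose model gives the highest likelihood."""
    return max(models, key=lambda user: log_likelihood(models[user], typed))


# Toy data, invented for illustration only.
models = {
    "alice": train_user_model([("th", 100), ("th", 110), ("he", 90), ("he", 95)]),
    "bob": train_user_model([("th", 200), ("th", 210), ("he", 180), ("he", 190)]),
}
print(identify(models, [("th", 105), ("he", 92)]))
```

Working in log space avoids numerical underflow when many latency values are multiplied together, which is why the sketch sums log densities instead of multiplying densities.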

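The outlier removal step and the similarity score described in the abstract can be sketched together as follows. The three-step neighbor test and the ratio definition come from the abstract. Everything else is an assumption for illustration: a latency value does not count as its own neighbor, the model parameters are a per-feature mean and standard deviation, a latency is "similar" when it lies within `k` standard deviations of the mean, and all function names and default thresholds are hypothetical.

```python
from statistics import mean, pstdev


def remove_outliers(latencies, distance, min_neighbors):
    """Keep x_i only if at least `min_neighbors` other values lie within `distance`.

    ASSUMPTION: x_i is not counted as a neighbor of itself.
    """
    kept = []
    for i, xi in enumerate(latencies):
        neighbors = sum(
            1
            for j, xj in enumerate(latencies)
            if j != i and abs(xj - xi) <= distance
        )
        if neighbors >= min_neighbors:
            kept.append(xi)
    return kept


def train_model(samples, distance=10.0, min_neighbors=2):
    """samples maps feature -> list of latencies typed by one user.

    ASSUMPTION: the model stores (mean, std) per feature after outlier removal.
    """
    model = {}
    for feature, latencies in samples.items():
        kept = remove_outliers(latencies, distance, min_neighbors)
        if kept:
            model[feature] = (mean(kept), pstdev(kept))
    return model


def similarity_score(model, typed, k=2.0):
    """Ratio of similar latency values to all latency values in the typed text.

    ASSUMPTION: "similar" means within k standard deviations of the model mean.
    """
    if not typed:
        return 0.0
    similar = 0
    for feature, latency in typed:
        if feature in model:
            m, s = model[feature]
            if abs(latency - m) <= k * s:
                similar += 1
    return similar / len(typed)


def identify(models, typed):
    """Assign the typed text to the user with the highest similarity score."""
    return max(models, key=lambda user: similarity_score(models[user], typed))


# Toy data, invented for illustration only; 400 is an obvious outlier.
models = {
    "alice": train_model({"th": [100, 102, 98, 101, 400]}),
    "bob": train_model({"th": [200, 204, 196, 202]}),
}
print(identify(models, [("th", 199), ("th", 203)]))
```

The pairwise neighbor count is quadratic in the number of latency values per feature, which is acceptable for a sketch; sorting the values first would allow a faster scan.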
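
The abstract does not spell out the dissertation's procedure for estimating the accuracy of MVR based fusion, so the following is only one simple way such an estimate could be computed when the fused methods have different accuracies. It assumes that the methods err independently and that the fused decision is counted as correct only when a strict majority of the methods are correct (for multi-user identification this is a conservative count, since a correct plurality can also win). The function name `mvr_accuracy` is hypothetical.

```python
from itertools import product


def mvr_accuracy(accuracies):
    """Probability that a strict majority of the methods is correct.

    ASSUMPTIONS: the methods err independently, and method i is correct with
    probability accuracies[i]. The accuracies need not be equal.
    """
    n = len(accuracies)
    total = 0.0
    # Enumerate every correct/incorrect pattern across the n methods.
    for outcome in product([True, False], repeat=n):
        if 2 * sum(outcome) > n:  # strict majority correct
            prob = 1.0
            for correct, p in zip(outcome, accuracies):
                prob *= p if correct else 1.0 - p
            total += prob
    return total


# Hypothetical accuracies, not results from the dissertation.
print(round(mvr_accuracy([0.9, 0.8, 0.7]), 3))  # 0.902
```

With three methods at 0.9, 0.8, and 0.7, the estimate is 0.504 (all correct) plus 0.216, 0.126, and 0.056 (exactly two correct), giving 0.902. The enumeration grows as 2^n, which is fine for the handful of methods typically fused.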

Bibliographic record

Author: Joshi, Shrijit S.
Author affiliation: Louisiana Tech University
Degree-granting institution: Louisiana Tech University
Discipline: Computer Science
Degree: Ph.D.
Year: 2009
Pages: 126 p.
Total pages: 126
Original format: PDF
Language: English
Added to database: 2022-08-17 11:37:49
