Publication Type

Journal Article

Publication Date (Issue Year)

2024

Journal Name

Journal of Physics Communications

Abstract

Machine learning classification algorithms have been extensively utilized in addressing user authentication challenges. Nonetheless, a majority of solutions categorize users into three classes, whereas adaptive authentication scenarios necessitate classification beyond this threshold. The rationale behind this limitation has not been thoroughly explored. The current study leveraged the Naive Bayes theorem for user authentication endeavors to assess the risk associated with login attempts. The Naive Bayes Machine Learning algorithm, along with its variations such as Gaussian, Categorical, and Bernoulli, was applied on both weighted and unweighted datasets to ascertain risk levels and categorize them into six classes. Additionally, the classification task was executed using alternative algorithms. The outcomes of cross-validation and comparative analyses revealed that the performance was commendable for up to three classes, after which a decrease was observed in certain Naive Bayes and SVM classifiers. Among the Naïve Bayes family, the Bernoulli NB algorithm exhibited superior performance but was surpassed by Decision Trees, SVM, XGB, and Random Forests. Notably, the weighted dataset consistently outperformed the unweighted counterpart, with the allocation of weights significantly influencing algorithmic efficacy. The 80:20 split strategy yielded the most favorable outcomes in contrast to the 70:30 and 60:40 splits, albeit no significant variances were detected during cross-validation. Non-Naïve Bayes algorithms demonstrated superior performance compared to Naïve Bayes algorithms. For Naïve Bayes, optimal performance is achieved with three classes, highlighting its utility in conditional risk calculation, while non-Naïve Bayes multi-classification algorithms are more suitable for classification tasks due to the problem’s inherent compatibility with conditional probabilities. In conclusion, it is imperative to acknowledge that the characteristics of the data, the use of weights, and the data splitting methodology significantly influence the accuracy of machine learning algorithms in multi-class user classification

Keywords

naïve bayes, multi-user classification, adaptive authentication

Rsif Scholar Name

Prudence Munyaradzi Mavhemwa

Rsif Scholar Nationality

Zimbabwe

Cohort

Cohort 2

Thematic Area

ICTs Including Big Data and Artificial Intelligence

Africa Host University (AHU)

University of Rwanda (UR), Rwanda

Share

COinS
 
 

To view the content in your browser, please download Adobe Reader or, alternately,
you may Download the file to your hard drive.

NOTE: The latest versions of Adobe Reader do not support viewing PDF files within Firefox on Mac OS and if you are using a modern (Intel) Mac, there is no official plugin for viewing PDF files within the browser window.