Performance Evaluation and Benchmarking of Machine Learning Algorithms for Network Intrusion Detection: An Ensemble Approach

Network intrusion detection remains a critical line of defense in modern networks. This study evaluates and benchmarks a range of classical and modern machine-learning algorithms for network intrusion detection, and proposes ensemble strategies to improve detection rate and robustness. We run experiments on multiple publicly available datasets covering different traffic scenarios and attack types: KDD Cup ’99 / NSL-KDD, UNSW-NB15, CIC-IDS2017, and CSE-CIC-IDS2018. For each dataset we apply consistent preprocessing, feature engineering, and class-imbalance handling; model selection uses stratified cross-validation and hyperparameter tuning. The algorithms evaluated include Logistic Regression, SVM, k-NN, Decision Trees, Random Forest, XGBoost, LightGBM, multilayer perceptrons (MLPs), CNN- and LSTM-based deep models, and unsupervised anomaly detectors such as Isolation Forest and Autoencoders. We design and test three families of ensemble strategies: bagging/voting, stacking, and hybrid ensembles that combine anomaly detectors with supervised classifiers. Models are compared on detection metrics (precision, recall, F1), ROC-AUC, and PR-AUC, as well as operational metrics (false alarm rate, detection latency, throughput). We assess the statistical significance of performance differences with paired t-tests and McNemar's test. Results show that stacking ensembles that blend tree-based learners and deep classifiers improve recall for minority attack classes while keeping the false alarm rate acceptably low. We provide open experimental code, tuned hyperparameters, and guidance for deploying the most promising models in production IDS pipelines.
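
As an illustration of the comparison protocol described above, the following minimal sketch computes the detection metrics and false alarm rate for two classifiers and applies an exact two-sided McNemar test to their paired predictions. It is not the released experimental code: the function names `detection_metrics` and `mcnemar_exact` are hypothetical, the binary encoding (1 = attack, 0 = benign) is an assumption, and the label vectors are invented toy values rather than results from any of the datasets.

```python
from math import comb


def detection_metrics(y_true, y_pred):
    """Binary detection metrics, assuming 1 = attack and 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # False alarm rate: fraction of benign traffic flagged as an attack.
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "far": far}


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar p-value for two classifiers on the same samples."""
    # Only discordant pairs (exactly one classifier correct) carry information.
    a_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a == t and b != t)
    b_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a != t and b == t)
    n = a_only + b_only
    if n == 0:
        return 1.0  # the classifiers never disagree in correctness
    k = min(a_only, b_only)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Invented toy labels for illustration only (not experimental results).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred_single = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]    # hypothetical single model
pred_ensemble = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]  # hypothetical stacked ensemble

single_scores = detection_metrics(y_true, pred_single)
ensemble_scores = detection_metrics(y_true, pred_ensemble)
p_value = mcnemar_exact(y_true, pred_single, pred_ensemble)

print("single  :", single_scores)
print("ensemble:", ensemble_scores)
print("McNemar exact p-value:", p_value)
```

The exact binomial form is used here because it stays valid when the number of discordant pairs is small, which can happen for rare attack classes; with many discordant pairs a chi-squared approximation would be a reasonable alternative.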