Adaptive ensemble decision integration for a resource security indicator: methodology and statistical validation of stability

O.P. Ilyina, S.Ya. Skybyk

Abstract


Ensuring effective decision support in complex distributed organizational systems (especially in national security and defense planning) requires reliable classification methods capable of rapid diagnosis of resource states and of risks to strategic interests. The effectiveness of a resource security indicator (RSI) built on machine-learning methods depends critically on the stability and reliability of the integrated predictions under conditions typical of this domain: significant class imbalance (where missing a negative state is critical), limited data volume, log-normal feature distributions with long tails, and noise components that reduce the stability of individual classifiers. To address these challenges, an adaptive ensemble integration mechanism for the RSI was developed, implementing weighted soft voting of models (NB, SVM, RF, kNN, LR) with unified probability calibration. Its central element is a composite dynamic quality metric (KQ), which combines F1_neg (prioritizing the minority class), MCC, and Kappa_norm, adapting their weights on the basis of their correlation. Trust coefficients (KDR) are integrated to adjust the influence of the models depending on their vulnerability to the data properties. The algorithm was validated on synthetic data simulating the log-normal distributions and lag effects of real-world conditions. A large-scale experiment (250 runs, paired design) confirmed the high statistical significance (p < 0.001, Wilcoxon test) of the RSI's superiority over the best single classifier (Random Forest) across all metrics (ΔKQ, ΔF1_neg, ΔRecall_neg). The effect size (Cohen's d ≥ 1.41) indicates large practical value. The results demonstrate that adaptive integration ensures the stability and reliability of risk diagnosis that are critically necessary for security applications.
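
The weighted soft voting and the paired validation described above can be summarized in a minimal Python sketch (assuming scikit-learn and SciPy). The function names, the fixed KQ weights, the sigmoid calibration and the hyperparameters below are illustrative assumptions; the paper's correlation-adaptive weight tuning and the KDR trust coefficients are not reproduced here.

# Minimal sketch of KQ-weighted soft voting over calibrated base models.
import numpy as np
from scipy.stats import wilcoxon
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score, f1_score, matthews_corrcoef
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def composite_kq(y_true, y_pred, w=(0.4, 0.3, 0.3)):
    # Illustrative composite metric: weighted sum of F1 for the negative
    # (minority) class, MCC and Cohen's Kappa; the paper adapts the weights
    # from metric correlations, here they are fixed.
    f1_neg = f1_score(y_true, y_pred, pos_label=0)
    mcc = matthews_corrcoef(y_true, y_pred)
    kappa = cohen_kappa_score(y_true, y_pred)
    return w[0] * f1_neg + w[1] * mcc + w[2] * kappa

def fit_weighted_soft_voting(X_train, y_train, X_val, y_val):
    # Calibrate each base model, score it with KQ on a validation split,
    # and normalize the scores into soft-voting weights.
    base = {
        "NB": GaussianNB(),
        "SVM": SVC(),  # calibrated below, so predict_proba is not required
        "RF": RandomForestClassifier(n_estimators=200, random_state=0),
        "kNN": KNeighborsClassifier(n_neighbors=7),
        "LR": LogisticRegression(max_iter=1000),
    }
    models, scores = {}, {}
    for name, est in base.items():
        clf = CalibratedClassifierCV(est, method="sigmoid", cv=3)  # unified calibration
        clf.fit(X_train, y_train)
        models[name] = clf
        scores[name] = max(composite_kq(y_val, clf.predict(X_val)), 0.0)
    total = sum(scores.values()) or 1.0
    weights = {name: s / total for name, s in scores.items()}
    return models, weights

def ensemble_proba(models, weights, X):
    # Weighted soft vote: KQ-weighted average of calibrated class probabilities.
    return sum(weights[name] * models[name].predict_proba(X) for name in models)

def paired_validation(deltas):
    # Paired Wilcoxon signed-rank test and Cohen's d for per-run metric
    # differences (ensemble minus best single model), mirroring the
    # 250-run paired design reported in the abstract.
    _, p_value = wilcoxon(deltas)
    d = float(np.mean(deltas)) / float(np.std(deltas, ddof=1))
    return p_value, d

In use, ensemble_proba(models, weights, X)[:, 0] would give the calibrated probability of the negative (at-risk) class, and the per-run KQ differences between the ensemble and the best single model could be passed to paired_validation to obtain the Wilcoxon p-value and Cohen's d.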

Problems in Programming 2025; 4: 88–101


Keywords


machine-learning classification; ensemble learning; adaptive quality metric; class imbalance; soft voting; statistical validation; strategic decision support
