Hybrid approach to processing incomplete stream data in distributed real-time systems

Y. Zhyliuk, V.L. Pleskach

Abstract


The article addresses the problem of processing incomplete streaming data in distributed real-time systems, particularly in the context of data mining. It notes that traditional imputation methods are ineffective under limited resources, strict processing-speed requirements, and the dynamic nature of streams. A hybrid approach is proposed that combines federated learning, contextual imputation, and adaptation to concept drift. The method lets local distributed computing nodes train lightweight imputation models on their own data, followed by centralised aggregation, distribution of the global model back to the nodes, and its dynamic updating. Experimental verification on a real dataset showed the advantages of the approach in accuracy (RMSE, MAE) and network load compared with baseline methods. The results demonstrate the effectiveness of the proposed method in distributed environments with limited computing resources.
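The pipeline outlined in the abstract (local training, central aggregation, imputation, drift-triggered updating) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the model or the drift detector, so the per-context running mean, the count-weighted averaging, and the Page-Hinkley test used here are assumptions chosen for brevity, and the names LocalImputer, aggregate, impute and PageHinkley are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LocalImputer:
    """Hypothetical lightweight node model: a running mean per context key."""
    sums: dict = field(default_factory=dict)
    counts: dict = field(default_factory=dict)

    def update(self, context, value):
        # Only observed values are used for training; None marks a missing reading.
        if value is None:
            return
        self.sums[context] = self.sums.get(context, 0.0) + value
        self.counts[context] = self.counts.get(context, 0) + 1

    def summary(self):
        # What a node would send upstream: (mean, count) per context, not raw data.
        return {c: (self.sums[c] / n, n) for c, n in self.counts.items()}


def aggregate(summaries):
    """Count-weighted average of node means per context (federated-averaging style)."""
    total, weight = {}, {}
    for s in summaries:
        for c, (mean, n) in s.items():
            total[c] = total.get(c, 0.0) + mean * n
            weight[c] = weight.get(c, 0) + n
    return {c: total[c] / weight[c] for c in total}


def impute(global_model, context, value, default=0.0):
    """Return an observed value unchanged; otherwise use the global estimate."""
    if value is not None:
        return value
    return global_model.get(context, default)


class PageHinkley:
    """Page-Hinkley test for an upward shift in a stream, e.g. imputation error."""

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta, self.threshold = delta, threshold
        self.n, self.mean, self.cum, self.min_cum = 0, 0.0, 0.0, 0.0

    def add(self, x):
        # Update the running mean and the cumulative deviation from it.
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        # Signal drift when the deviation rises far above its historical minimum.
        return self.cum - self.min_cum > self.threshold
```

In such a setup each node would call update on its own stream and periodically send summary() to the aggregator. The aggregated dictionary is then sent back and used by impute. A True result from PageHinkley.add on the observed imputation error could serve as the trigger for a new training round. Only per-context means and counts cross the network, which is one plausible way to keep network load low.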

Problems in programming 2025; 2: 112-121


