Adaptive Chaos Engineering and AI-Driven Dependability Modeling for Resilient Cloud-Native and Safety-Critical Systems
Abstract
The increasing reliance on cloud-native architectures, serverless computing, and artificial intelligence-driven systems has made it harder to ensure system dependability, resilience, and safety. Traditional reliability engineering approaches, while foundational, are often insufficient to address the dynamic, distributed, and failure-prone nature of modern cloud ecosystems. This research presents a comprehensive, theoretically grounded framework that integrates chaos engineering, machine learning-based reliability modeling, and human-centered safety principles to enhance system robustness across cloud-native and safety-critical domains, including healthcare and autonomous systems.
The study synthesizes interdisciplinary perspectives from cloud computing, dependability engineering, fault injection methodologies, and AI-based safety analysis. It explores how experimental fault injection, particularly through chaos engineering practices, can be combined with predictive analytics to proactively identify and mitigate system vulnerabilities. Furthermore, the research emphasizes the importance of realism in error injection, the role of serverless architectures in resilience testing, and the integration of human factors in safety-critical environments.
A qualitative, theory-driven methodology is employed to construct a unified framework that bridges the gap between cloud system resilience and safety engineering in domains such as healthcare. The findings suggest that integrating chaos engineering with machine learning enhances predictive fault detection, improves the understanding of failure propagation, and supports adaptive system recovery mechanisms. Additionally, the study highlights that human-centered design and the integration of error taxonomies contribute significantly to reducing systemic risks in critical infrastructures.
The proposed framework offers a novel contribution by aligning chaos engineering practices with AI-driven reliability assessment and safety assurance principles. It provides a scalable and adaptable approach for organizations seeking to build resilient, trustworthy, and high-performance systems in increasingly complex technological landscapes.
Similar Articles
- Andras Varga, A Socio-Technical Framework for Error Budget–Driven Reliability Governance in Cloud-Native and Edge-Integrated Distributed Systems, International Journal of Next-Generation Engineering and Technology: Vol. 3 No. 01 (2026): Volume 03 Issue 01
- Dr. Ethan Williams, Dr. Olivia Carter, Dr. Liam Anderson, Autonomous Fault Management in Cloud Environments Through Deep Learning-Based Decision Making , International Journal of Next-Generation Engineering and Technology: Vol. 3 No. 01 (2026): Volume 03 Issue 01
- Dr. Elena Markovic, Adaptive Latency-Aware Microservice Orchestration and Anomaly-Resilient Edge–Cloud Architectures for Mixed Reality and Time-Critical Applications, International Journal of Next-Generation Engineering and Technology: Vol. 1 No. 01 (2024): Volume 01 Issue 01
- Dr. Eleanor Whitmore, Cloud-Native Smart Health Platforms: Scalable Machine Learning Deployment for Cardiovascular Prediction through Heroku, Salesforce, and Urban Data Ecosystems , International Journal of Next-Generation Engineering and Technology: Vol. 3 No. 01 (2026): Volume 03 Issue 01
- Prof. Kavita Menon, An In-Depth Review of Recent Advances in Cables and Towed Objects for Ocean Engineering Towing Systems , International Journal of Next-Generation Engineering and Technology: Vol. 2 No. 08 (2025): Volume 02 Issue 08
- Elena M. Hartwell, Prof. Daniel K. Mercer, Dr. Sofia M. Alvarez, Adaptive and Secure Dynamic Voltage Restoration in Smart Power Networks: A Text-Based Integrative Research Study on PI-Controlled DVRs, Converter Coordination, Energy Management, and Cyber-Physical Resilience , International Journal of Next-Generation Engineering and Technology: Vol. 3 No. 04 (2026): Volume 03 Issue 04
- Dr. Clara E. Whitmore, Artificial Intelligence for Resilient Decentralized Infrastructures: An Integrative Research Study on Hybrid Renewable Energy Management and Real-Time Digital Payment Fraud Detection , International Journal of Next-Generation Engineering and Technology: Vol. 3 No. 04 (2026): Volume 03 Issue 04
- Samuel T. Ridgeway, Factory-Grade GPU Diagnostic Automation in Digital Pathology and Computational Inference Systems: A Cross-Domain Theoretical and Applied Investigation , International Journal of Next-Generation Engineering and Technology: Vol. 3 No. 01 (2026): Volume 03 Issue 01
- Clara Engelhardt, Resilient and Secure Time-Sensitive Architectures for Safety-Critical Cyber-Physical Systems: Integrating Predictability, Networking Standards, And Fault-Tolerant Design , International Journal of Next-Generation Engineering and Technology: Vol. 3 No. 01 (2026): Volume 03 Issue 01
- Dr. Eleanor M. Whitford, Deep Learning and Intelligent Control in High-Stakes Systems: An Integrative Research Study on Lung Cancer CT Diagnosis and AI-Enabled Electric Vehicle Grid Management , International Journal of Next-Generation Engineering and Technology: Vol. 3 No. 04 (2026): Volume 03 Issue 04