Adaptive Chaos Engineering and AI-Driven Dependability Modeling for Resilient Cloud-Native and Safety-Critical Systems
Abstract
The increasing reliance on cloud-native architectures, serverless computing, and artificial intelligence-driven systems has introduced new complexities in ensuring system dependability, resilience, and safety. Traditional reliability engineering approaches, while foundational, are often insufficient for the dynamic, distributed, and failure-prone nature of modern cloud ecosystems. This research presents a comprehensive, theoretically grounded framework that integrates chaos engineering, machine learning-based reliability modeling, and human-centered safety principles to enhance system robustness across cloud-native and safety-critical domains, including healthcare and autonomous systems.
The study synthesizes interdisciplinary perspectives from cloud computing, dependability engineering, fault injection methodologies, and AI-based safety analysis. It explores how experimental fault injection, particularly through chaos engineering practices, can be combined with predictive analytics to proactively identify and mitigate system vulnerabilities. Furthermore, the research emphasizes the importance of realism in error injection, the role of serverless architectures in resilience testing, and the integration of human factors in safety-critical environments.
A qualitative, theory-driven methodology is employed to construct a unified framework that bridges gaps between cloud system resilience and safety engineering in domains such as healthcare. The findings suggest that integrating chaos engineering with machine learning enhances predictive fault detection, improves the understanding of failure propagation, and supports adaptive system recovery mechanisms. Additionally, the study highlights that human-centered design and the integration of error taxonomies significantly contribute to reducing systemic risks in critical infrastructures.
The proposed framework offers a novel contribution by aligning chaos engineering practices with AI-driven reliability assessment and safety assurance principles. It provides a scalable and adaptable approach for organizations seeking to build resilient, trustworthy, and high-performance systems in increasingly complex technological landscapes.
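As a purely illustrative aid, the pairing of fault injection with predictive detection described in this abstract can be sketched in a few lines. The sketch below is not the framework proposed in the study: it assumes a simulated service with hypothetical latency figures, and it uses a simple z-score check as a stand-in for a machine learning model. All function names and parameters are invented for the illustration.

```python
import random
import statistics


def simulate_latencies(n, fault_window=None, fault_delay_ms=0.0, seed=0):
    """Return n simulated request latencies in milliseconds.

    Requests whose index falls inside fault_window (start, end) receive an
    extra injected delay, mimicking a chaos experiment on a dependency.
    The 100 ms mean and 5 ms spread are hypothetical values.
    """
    rng = random.Random(seed)
    latencies = []
    for i in range(n):
        latency = rng.gauss(100.0, 5.0)
        if fault_window and fault_window[0] <= i < fault_window[1]:
            latency += fault_delay_ms  # injected fault
        latencies.append(latency)
    return latencies


def detect_anomalies(baseline, observed, z_threshold=4.0):
    """Flag indices in observed that deviate strongly from the baseline.

    A z-score test stands in for the learned reliability model; a real
    system would likely use a trained predictor instead.
    """
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)
    return [
        i for i, value in enumerate(observed)
        if abs(value - mean) / stdev > z_threshold
    ]


# Steady-state run without faults, used to learn normal behavior.
baseline = simulate_latencies(200, seed=1)

# Chaos experiment: inject 80 ms of extra latency into requests 120..139.
experiment = simulate_latencies(
    200, fault_window=(120, 140), fault_delay_ms=80.0, seed=2
)

flagged = detect_anomalies(baseline, experiment)
print(f"Flagged {len(flagged)} requests, first at index {flagged[0]}")
```

The design choice worth noting is the separation between a fault-free baseline run and the experiment run: the detector learns normal behavior first, so the injected fault becomes a labeled test of whether the detector would have caught a real failure.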