Open Access

INTEGRATING LAKEHOUSE ARCHITECTURES AND CLOUD DATA WAREHOUSING FOR NEXT-GENERATION ENTERPRISE ANALYTICS

4 Sorbonne Université, Paris, France

Abstract

The exponential growth of digital data across diverse domains has necessitated the evolution of advanced data storage and analytical frameworks capable of handling high-velocity, high-volume, and high-variety datasets. Traditional data warehousing approaches, while robust for structured data and reporting, often struggle to accommodate the scale, flexibility, and real-time processing requirements imposed by modern enterprises. Emerging paradigms, including data lakes, lakehouses, and cloud-native data warehousing platforms, seek to reconcile the strengths of structured and unstructured data management, providing unified solutions for complex analytical workflows. This paper critically examines the integration of lakehouse architectures with cloud-based data warehousing systems, with a particular focus on Amazon Redshift as a representative cloud-native solution (Worlikar, Patel, & Challa, 2025). By synthesizing theoretical underpinnings, empirical implementations, and performance analyses, the study elucidates the operational, computational, and strategic implications of adopting hybrid data architectures. Key contributions include a comprehensive evaluation of ACID-compliant storage solutions such as Delta Lake, Apache Iceberg, and Hudi; the operationalization of machine learning pipelines in production contexts; and the nuanced role of metadata management in ensuring data governance and reproducibility. The findings underscore the transformative potential of integrated lakehouse and cloud data warehousing models for enterprise-scale analytics, highlighting best practices for design, deployment, and optimization while addressing critical limitations and open research questions. The paper concludes by proposing a structured framework for future adoption, emphasizing scalability, interoperability, and the alignment of technical capabilities with organizational objectives.

Keywords

References

David Naseh, et al. Real-World Implementation and Performance Analysis of Distributed Learning Frameworks for 6G IoT Applications. Information, 2024.
Michael Armbrust, et al. Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores. Proceedings of the VLDB Endowment, 2020.
Worlikar, S., Patel, H., & Challa, A. Amazon Redshift Cookbook: Recipes for building modern data warehousing solutions. Packt Publishing Ltd., 2025.
Z. Bicevska and I. Oditis. Towards NoSQL-based Data Warehouse Solutions. Procedia Computer Science, 2017.
Paul Iusztin. A Framework for Building a Production-Ready Feature Engineering Pipeline. Medium, 2023.
R. L. Grossman. Data Lakes, Clouds, and Commons: A Review of Platforms for Analyzing and Sharing Genomic Data. Trends in Genetics, 2019.
Naresh Dulam. Mastering Open Table Formats: A Guide to Apache Iceberg, Hudi, and Delta Lake. Medium, 2024.
Beekeeper. Operational Excellence? Definitions, Tips, and Best Practices Revealed. Beekeeper, 2021.
Michael Armbrust, et al. Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics. CIDR, 2021.
C. Giebler, et al. Leveraging the Data Lake: Current State and Challenges. Springer, 2019.
Pavel Klushin. Feature Store Benefits: The Advantages of Feature Stores in Machine Learning Development. JFrog Blog, 2024.
Sabarinathan Sampath. The Evolution of the Lakehouse: Bridging Data Lakes and Warehouses. LinkedIn, 2024.
Dhuha A. Al-kazzaz. Instrumentalization of Machine Learning in Architectural Design. International Review of Applied Sciences and Engineering, 2025.
