Open Access

A Scalable Python-Based Architecture for Causal Structure Learning in Non-Gaussian Linear Systems Using the PyCD-LiNGAM Framework

Faculty of Computer Science, Sharif University of Technology, Tehran, Iran

Abstract

Causal structure learning in high-dimensional systems remains a fundamental challenge in modern machine learning and statistical inference, particularly when underlying data-generating processes deviate from Gaussian assumptions. Linear Non-Gaussian Acyclic Models (LiNGAM) provide a principled framework for identifying causal directions using non-Gaussianity as an identification condition. However, scalability, computational efficiency, and reproducibility issues continue to limit their practical adoption in large-scale data environments. This study proposes a scalable Python-based architecture implemented through the PyCD-LiNGAM framework to address these limitations by integrating modular computation, optimized matrix operations, and automated causal graph discovery pipelines.
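To make the identification idea concrete, the following minimal sketch shows how non-Gaussianity can reveal causal direction in a two-variable linear model. It is an illustration only and does not reproduce the PyCD-LiNGAM implementation. The function names (`ols_residual`, `dependence_score`, `infer_direction`), the simulated data, and the coefficient 0.8 are all hypothetical. The dependence score is a deliberately crude third-order statistic that assumes skewed noise; practical LiNGAM estimators rely on more general independence measures.

```python
import numpy as np


def ols_residual(target, regressor):
    """Residual of a least-squares fit of target on regressor (both centered)."""
    slope = np.dot(regressor, target) / np.dot(regressor, regressor)
    return target - slope * regressor


def dependence_score(regressor, residual):
    """Crude higher-order dependence score between regressor and residual.

    OLS residuals are always uncorrelated with the regressor, so the score
    looks at correlations involving squares instead. Assumption: the noise
    is skewed, so third-order statistics carry the signal.
    """
    c1 = np.corrcoef(regressor ** 2, residual)[0, 1]
    c2 = np.corrcoef(regressor, residual ** 2)[0, 1]
    return abs(c1) + abs(c2)


def infer_direction(x, y):
    """Pick the direction whose regression residual looks more independent."""
    x = x - x.mean()
    y = y - y.mean()
    score_xy = dependence_score(x, ols_residual(y, x))  # hypothesis x -> y
    score_yx = dependence_score(y, ols_residual(x, y))  # hypothesis y -> x
    direction = "x -> y" if score_xy < score_yx else "y -> x"
    return direction, score_xy, score_yx


# Simulated data (hypothetical): x causes y, with centered exponential noise.
rng = np.random.default_rng(0)
n = 5000
x = rng.exponential(1.0, n) - 1.0
noise = rng.exponential(1.0, n) - 1.0
y = 0.8 * x + noise

direction, score_xy, score_yx = infer_direction(x, y)
print(direction, round(score_xy, 3), round(score_yx, 3))
```

In the true causal direction the regression residual is independent of the regressor, so its score stays near zero. In the reverse direction the residual is uncorrelated with the regressor but still dependent on it, which the higher-order terms pick up. With Gaussian noise both scores would be near zero and the direction would not be identifiable, which is why the non-Gaussian assumption matters.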

The proposed framework builds upon prior theoretical advancements in causal discovery and graphical modeling, particularly greedy structure learning strategies (Chickering, 2002), probabilistic graphical modeling principles (Drton & Maathuis, 2017), and linear non-Gaussian causal identification theory (Entner & Hoyer, 2011). Furthermore, it incorporates semiparametric inference perspectives for handling latent confounding structures (Bhattacharya et al., 2020). The system is evaluated conceptually for scalability, robustness to noise, and interpretability in non-Gaussian environments.

The conceptual evaluation suggests that modular Python-based causal pipelines improve computational tractability while preserving the theoretical identifiability guarantees that hold under non-Gaussian assumptions. The study contributes a unified computational architecture that bridges theoretical causal discovery models and practical implementation constraints, enabling reproducible and scalable causal inference workflows.

References

Bhattacharya, R., Nabi, R., & Shpitser, I. (2020). Semiparametric inference for causal effects in graphical models with hidden variables. arXiv preprint arXiv:2003.12659.
Campomanes, P., Neri, M., Horta, B. A. C., Roehrig, U. F., Vanni, S., Tavernelli, I., & Rothlisberger, U. (2014). Origin of the spectral shifts among the early intermediates of the rhodopsin photocycle. Journal of the American Chemical Society, 136(10), 3842-3851.
Chickering, D. M. (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research, 3, 507-554.
Drton, M., & Maathuis, M. H. (2017). Structure learning in graphical modeling. Annual Review of Statistics and Its Application, 4, 365-393.
Entner, D., & Hoyer, P. O. (2011). Discovering unconfounded causal relationships using linear non-Gaussian models. In New Frontiers in Artificial.
