International Journal of Modern Computer Science and IT Innovations

  1. Home
  2. Archives
  3. Vol. 2 No. 10 (2025): Volume 02 Issue 10
  4. Articles
International Journal of Modern Computer Science and IT Innovations

Article Details Page

Dynamic Deep Neural Network Partitioning For Low-Latency Edge-Assisted Video Analytics: A Learning-To-Partition Approach

Authors

  • Daniela Costa Department of Artificial Intelligence, Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil
  • Rafael Lima Institute of Data Science and Analytics, Universidade Federal de Pernambuco (UFPE), Recife, Brazil

DOI:

https://doi.org/10.55640/

Keywords:

Deep Neural Network Partitioning, Edge Computing, Low-Latency Video Analytics

Abstract

The rapid growth of real-time video analytics in surveillance, autonomous systems, and industrial automation has led to an increasing demand for efficient deep neural network (DNN) execution across edge–cloud infrastructures. Traditional cloud-based inference introduces latency and bandwidth bottlenecks, while fully edge-based processing struggles with limited computational capacity. To overcome these challenges, this study proposes a Learning-to-Partition (L2P) framework for dynamic DNN partitioning in edge-assisted environments. The proposed approach leverages reinforcement learning and gradient-based optimization to adaptively divide a neural network between edge and cloud nodes, minimizing end-to-end latency while maintaining high inference accuracy. Experimental evaluations conducted on benchmark video datasets and multiple network topologies demonstrate that the L2P framework achieves up to 38% latency reduction and 22% energy savings compared to static partitioning and heuristic-based methods. Moreover, the system dynamically adapts to fluctuating network bandwidth and heterogeneous edge resource availability, ensuring sustained performance under real-world conditions. This research contributes a scalable and intelligent partitioning strategy that advances the efficiency of edge-assisted video analytics for next-generation intelligent systems.

References

Xiao, Z.; Xia, Z.; Zheng, H.; Zhao, B.Y.; Jiang, J. Towards performance clarity of edge video analytics. In Proceedings of the 2021 IEEE/ACM Symposium on Edge Computing (SEC), San Jose, CA, USA, 14–17 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 148–164.

Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767.

Ananthanarayanan, G.; Bahl, P.; Bodík, P.; Chintalapudi, K.; Philipose, M.; Ravindranath, L.; Sinha, S. Real-time video analytics: The killer app for edge computing. Computer 2017, 50, 58–67.

Fang, W.; Xu, W.; Yu, C.; Xiong, N.N. Joint Architecture Design and Workload Partitioning for DNN Inference on Industrial IoT Clusters. ACM Trans. Internet Technol. (TOIT) 2022, 23, 7.

Chadha, K. S. (2025). Edge AI for Real-Time ICU Alarm Fatigue Reduction: Federated Anomaly Detection on Wearable Streams. Utilitas Mathematica, 122(2), 291–308. Retrieved from https://utilitasmathematica.com/index.php/Index/article/view/2708

Mohammed, T.; Joe-Wong, C.; Babbar, R.; Di Francesco, M. Distributed inference acceleration with adaptive DNN partitioning and offloading. In Proceedings of the IEEE INFOCOM 2020-IEEE Conference on Computer Communications, Virtual, 2–5 May 2022; IEEE: Piscataway, NJ, USA, 2020; pp. 854–863.

Matsubara, Y.; Levorato, M.; Restuccia, F. Split computing and early exiting for deep learning applications: Survey and research challenges. ACM Comput. Surv. 2022, 55, 1–30.

Zhao, K.; Zhou, Z.; Chen, X.; Zhou, R.; Zhang, X.; Yu, S.; Wu, D. EdgeAdaptor: Online Configuration Adaption, Model Selection and Resource Provisioning for Edge DNN Inference Serving at Scale. IEEE Trans. Mob. Comput. 2022, 22, 5870–5886.

Hu, C.; Li, B. Distributed inference with deep learning models across heterogeneous edge devices. In Proceedings of the IEEE INFOCOM 2022-IEEE Conference on Computer Communications, Virtual, 2–5 May 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 330–339.

Liu, J.; Gao, G. CSVA: Complexity-Driven and Semantic-Aware Video Analytics via Edge-Cloud Collaboration. In Proceedings of the International Conference on Wireless Artificial Intelligent Computing Systems and Applications, Tokyo, Japan, 24–26 June 2025; pp. 107–116.

Dong, C.; Hu, S.; Chen, X.; Wen, W. Joint optimization with DNN partitioning and resource allocation in mobile edge computing. IEEE Trans. Netw. Serv. Manag. 2021, 18, 3973–3986.

Chen, B.; Yan, Z.; Nahrstedt, K. Context-aware image compression optimization for visual analytics offloading. In Proceedings of the 13th ACM Multimedia Systems Conference, Athlone, Ireland, 14–17 June 2022; pp. 27–38.

Chandra, R. (2025). Security and privacy testing automation for LLM-enhanced applications in mobile devices. International Journal of Networks and Security, 5(2), 30–41. https://doi.org/10.55640/ijns-05-02-02

Jiang, J.; Luo, Z.; Hu, C.; He, Z.; Wang, Z.; Xia, S.; Wu, C. Joint model and data adaptation for cloud inference serving. In Proceedings of the 2021 IEEE Real-Time Systems Symposium (RTSS), Dortmund, Germany, 7–10 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 279–289.

Chandra, R. (2025). Reducing latency and enhancing accuracy in LLM inference through firmware-level optimization. International Journal of Signal Processing, Embedded Systems and VLSI Design, 5(2), 26–36. https://doi.org/10.55640/ijvsli-05-02-02

Zhang, H.; Ananthanarayanan, G.; Bodik, P.; Philipose, M.; Bahl, P.; Freedman, M.J. Live video analytics at scale with approximation and delay-tolerance. In Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, Boston, MA, USA, 27–29 March 2017.

LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.

Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.

Du, K.; Zhang, Q.; Arapin, A.; Wang, H.; Xia, Z.; Jiang, J. Accmpeg: Optimizing video encoding for video analytics. arXiv 2022, arXiv:2204.12534.

Liu, L.; Li, H.; Gruteser, M. Edge assisted real-time object detection for mobile augmented reality. In Proceedings of the 25th Annual International Conference on Mobile Computing and Networking, Los Cabos, Mexico, 21–25 October 2019; pp. 1–16.

Jiang, J.; Ananthanarayanan, G.; Bodik, P.; Sen, S.; Stoica, I. Chameleon: Scalable adaptation of video analytics. In Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, Budapest, Hungary, 20–25 August 2018; pp. 253–266.

Wang, X.; Gao, G. SmartEye: An Open Source Framework for Real-Time Video Analytics with Edge-Cloud Collaboration. In Proceedings of the 29th ACM International Conference on Multimedia, Chengdu, China, 20–24 October 2021; pp. 3767–3770.

He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–90.

Gannavarapu, P. (2025). Performance optimization of hybrid Azure AD join across multi-forest deployments. Journal of Information Systems Engineering and Management, 10(45s), e575–e593. https://doi.org/10.55278/jisem.2025.10.45s.575

Chandra, R. (2025). Security and privacy testing automation for LLM-enhanced applications in mobile devices. International Journal of Networks and Security, 5(2), 30–41. https://doi.org/10.55640/ijns-05-02-02

Chen, J.; Ran, X. Deep learning with edge computing: A review. Proc. IEEE 2019, 107, 1655–1674.

Kang, Y.; Hauswald, J.; Gao, C.; Rovinski, A.; Mudge, T.; Mars, J.; Tang, L. Neurosurgeon: Collaborative intelligence between the cloud and mobile edge. ACM SIGARCH Comput. Archit. News 2017, 45, 615–629.

Hu, C.; Bao, W.; Wang, D.; Liu, F. Dynamic adaptive DNN surgery for inference acceleration on the edge. In Proceedings of the IEEE INFOCOM 2019-IEEE Conference on Computer Communications, Paris, France, 29 April–2 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1423–1431.

Shao, J.; Zhang, J. Communication-computation trade-off in resource-constrained edge inference. IEEE Commun. Mag. 2020, 58, 20–26.

Gao, G.; Dong, Y.; Wang, R.; Zhou, X. EdgeVision: Towards collaborative video analytics on distributed edges for performance maximization. IEEE Transactions on Multimedia 2024, 26, 9083–9094.

Dong, Y.; Gao, G. EdgeCam: A Distributed Camera Operating System for Inference Scheduling and Continuous Learning. In Proceedings of the 2024 IEEE/ACM Ninth International Conference on Internet-of-Things Design and Implementation (IoTDI), Hong Kong, China, 13–16 May 2024; pp. 225–226.

Ran, X.; Chen, H.; Zhu, X.; Liu, Z.; Chen, J. Deepdecision: A mobile deep learning framework for edge video analytics. In Proceedings of the IEEE INFOCOM 2018-IEEE Conference on Computer Communications, Honolulu, HI, USA, 15–19 April 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1421–1429.

Lulla, K.; Chandra, R.; & Ranjan, K. (2025). Factory-grade diagnostic automation for GeForce and data centre GPUs. International Journal of Engineering, Science and Information Technology, 5(3), 537–544. https://doi.org/10.52088/ijesty.v5i3.1089

Laskaridis, S.; Venieris, S.I.; Almeida, M.; Leontiadis, I.; Lane, N.D. SPINN: Synergistic progressive inference of neural networks over device and cloud. In Proceedings of the 26th Annual International Conference on Mobile Computing and Networking, London, UK, 21–25 September 2020; pp. 1–15.

Lulla, K. L., Chandra, R. C., & Sirigiri, K. S. (2025). Proxy-based thermal and acoustic evaluation of cloud GPUs for AI training workloads. The American Journal of Applied Sciences, 7(7), 111–127. https://doi.org/10.37547/tajas/Volume07Issue07-12

Zeng, L.; Chen, X.; Zhou, Z.; Yang, L.; Zhang, J. Coedge: Cooperative dnn inference with adaptive workload partitioning over heterogeneous edge devices. IEEE/ACM Trans. Netw. 2020, 29, 595–608.

Tang, X.; Chen, X.; Zeng, L.; Yu, S.; Chen, L. Joint multiuser dnn partitioning and computational resource allocation for collaborative edge intelligence. IEEE Internet Things J. 2020, 8, 9511–9522.

Lulla, K. (2025). Python-based GPU testing pipelines: Enabling zero-failure production lines. Journal of Information Systems Engineering and Management, 10(47s), 978–994. https://doi.org/10.55278/jisem.2025.10.47s.978

Peng, S.; Shen, Z.; Zheng, Q.; Hou, X.; Jiang, D.; Yuan, J.; Jin, J. APT-SAT: An Adaptive DNN Partitioning and Task Offloading Framework within Collaborative Satellite Computing Environments. IEEE Trans. Netw. Sci. Eng. 2025.

Shao, J.; Zhang, J. Bottlenet++: An end-to-end approach for feature compression in device-edge co-inference systems. In Proceedings of the 2020 IEEE International Conference on Communications Workshops (ICC Workshops), Dublin, Ireland, 7–11 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6.

Zhang, M.; Fang, J.; Teng, Z.; Liu, Y.; Wu, S. Joint DNN Partitioning and Task Offloading Based on Attention Mechanism-Aided Reinforcement Learning. IEEE Trans. Netw. Serv. Manag. 2025, 22, 2914–2927.

Liu, J.; Gao, G. CSVA: Complexity-Driven and Semantic-Aware Video Analytics via Edge-Cloud Collaboration. In Proceedings of the International Conference on Wireless Artificial Intelligent Computing Systems and Applications, Tokyo, Japan, 24–26 June 2025; pp. 107–116.

Li, H.; Hu, C.; Jiang, J.; Wang, Z.; Wen, Y.; Zhu, W. Jalad: Joint accuracy-and latency-aware deep structure decoupling for edge-cloud execution. In Proceedings of the 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS), Singapore, 11–13 December 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 671–678.

Reddy Gundla, S. (2025). PostgreSQL Tuning for Cloud-Native Java: Connection Pooling vs. Reactive Drivers. International Journal of Computational and Experimental Science and Engineering, 11(3). https://doi.org/10.22399/ijcesen.3479

Kumar Enugala, V. (2025). Quantum Sensors for Micro-Corrosion Detection. International Journal of Computational and Experimental Science and Engineering, 11(3). https://doi.org/10.22399/ijcesen.3481

Downloads

Published

2025-10-18

How to Cite

Dynamic Deep Neural Network Partitioning For Low-Latency Edge-Assisted Video Analytics: A Learning-To-Partition Approach. (2025). International Journal of Modern Computer Science and IT Innovations, 2(10), 40-50. https://doi.org/10.55640/

How to Cite

Dynamic Deep Neural Network Partitioning For Low-Latency Edge-Assisted Video Analytics: A Learning-To-Partition Approach. (2025). International Journal of Modern Computer Science and IT Innovations, 2(10), 40-50. https://doi.org/10.55640/

Similar Articles

1-10 of 17

You may also start an advanced similarity search for this article.