Formal Operational Models for Protecting Web Interfaces of Legal LLM Systems from Prompt Injection and Insecure Output Handling

Grigorii Danileiko

doi:10.55640/ijaair-v03i05-03

Open Access

Grigorii Danileiko, Agiloft Canada, Inc.; University Researcher

Abstract

The proliferation of large language model (LLM) systems in legal technology platforms has created a new class of web-interface security vulnerabilities that existing application security frameworks address incompletely. This paper examines prompt injection and insecure output handling as the two primary attack surfaces for legal LLM web applications, with particular attention to contract lifecycle management systems that expose natural-language interfaces to privileged document repositories. Drawing on a systematic review of current OWASP LLM Top 10 guidance, peer-reviewed security literature, and practitioner case analyses, the study proposes a structured compositional operational model in which each processing stage of an LLM web pipeline is represented as a transformation function with explicitly stated security constraints. The model introduces six operators, Sanitize, Contextualize, Policy-Check, Infer, Encode, and Validate, composed in a single end-to-end pipeline whose behavior is described through finite-state transitions and trust-level tagging. The analysis indicates that the proposed compositional model can support systematic enumeration of attack paths and can be translated into an implementation-oriented checklist for practitioners. The findings are relevant to security architects, front-end engineers, and legal technology product teams who design or audit LLM-integrated web applications.
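The six-operator composition described in the abstract can be illustrated with a minimal sketch. The abstract names the operators and says the pipeline uses finite-state transitions and trust-level tagging, but it gives no concrete definitions. Everything below is therefore an assumption made for illustration: the `Trust` states, the `Msg` record, the `step` helper, the toy deny-list rule, and the stubbed model call. It is not the paper's formal model.

```python
from dataclasses import dataclass, replace
from enum import Enum
from functools import reduce
from html import escape
from typing import Callable


class Trust(Enum):
    """Hypothetical trust tags; each operator moves a message one state forward."""
    UNTRUSTED = 0
    SANITIZED = 1
    CONTEXTUALIZED = 2
    CHECKED = 3
    MODEL_OUTPUT = 4
    ENCODED = 5
    VALIDATED = 6


@dataclass(frozen=True)
class Msg:
    text: str
    trust: Trust = Trust.UNTRUSTED
    context: str = ""


class PolicyViolation(Exception):
    """Raised when a security constraint or a state transition is violated."""


Stage = Callable[[Msg], Msg]


def step(src: Trust, dst: Trust, fn: Stage) -> Stage:
    """Wrap fn so it only accepts messages tagged src and always emits dst."""
    def run(msg: Msg) -> Msg:
        if msg.trust is not src:
            raise PolicyViolation(f"expected {src.name}, got {msg.trust.name}")
        return replace(fn(msg), trust=dst)
    return run


def _sanitize(m: Msg) -> Msg:
    # Drop non-printable characters, keeping newlines and tabs.
    kept = "".join(c for c in m.text if c.isprintable() or c in "\n\t")
    return replace(m, text=kept)


def _contextualize(m: Msg) -> Msg:
    # Keep system instructions in a separate field from user text.
    return replace(m, context="You summarize contract clauses.")


def _policy_check(m: Msg) -> Msg:
    # Toy deny-list rule (an assumption); real policies would be far richer.
    if "ignore previous instructions" in m.text.lower():
        raise PolicyViolation("suspected prompt injection")
    return m


def _infer(m: Msg) -> Msg:
    # Stub standing in for a model call; it echoes input, markup included.
    return replace(m, text=f"Summary: {m.text}")


def _encode(m: Msg) -> Msg:
    # Escape model output for an HTML body context.
    return replace(m, text=escape(m.text))


def _validate(m: Msg) -> Msg:
    # Final gate: no raw angle brackets may reach the renderer.
    if "<" in m.text or ">" in m.text:
        raise PolicyViolation("unencoded markup in output")
    return m


PIPELINE = [
    step(Trust.UNTRUSTED, Trust.SANITIZED, _sanitize),
    step(Trust.SANITIZED, Trust.CONTEXTUALIZED, _contextualize),
    step(Trust.CONTEXTUALIZED, Trust.CHECKED, _policy_check),
    step(Trust.CHECKED, Trust.MODEL_OUTPUT, _infer),
    step(Trust.MODEL_OUTPUT, Trust.ENCODED, _encode),
    step(Trust.ENCODED, Trust.VALIDATED, _validate),
]


def run_pipeline(user_text: str) -> Msg:
    """Compose the six stages left to right over an untrusted input."""
    return reduce(lambda m, stage: stage(m), PIPELINE, Msg(user_text))
```

In this sketch, calling `run_pipeline("Summarize clause 4 <script>alert(1)</script>")` yields a message tagged `VALIDATED` whose text contains only escaped markup, while an input containing the deny-listed phrase raises `PolicyViolation` at the policy stage. Because each stage checks the incoming tag, skipping or reordering an operator fails immediately, which is one way the attack-path enumeration mentioned above could be made mechanical.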

Keywords

Prompt injection, insecure output handling, LLM security

References

OWASP GenAI Security Project. (2025). OWASP Top 10 for Large Language Model Applications 2025. Retrieved from: https://genai.owasp.org/llm-top-10/ (date accessed: November 5, 2025).

Ferrag, M. A., Tihanyi, N., Hamouda, D., Maglaras, L., Lakas, A., & Debbah, M. (2026). From prompt injections to protocol exploits: Threats in LLM-powered AI agents workflows. ICT Express, 12(2), 353–383. https://doi.org/10.1016/j.icte.2025.12.001.

Gulyamov, S., Gulyamov, S., Rodionov, A., Khursanov, R., Mekhmonov, K., Babaev, D., & Rakhimjonov, A. (2026). Prompt injection attacks in large language models and AI agent systems: A comprehensive review of vulnerabilities, attack vectors, and defense mechanisms. Information, 17(1), Article 54. https://doi.org/10.3390/info17010054.

Johnson, S., Pham, V., & Le, T. (2025). The dangers of indirect prompt injection attacks on LLM-based autonomous web navigation agents: A demonstration. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 729–738). Association for Computational Linguistics. https://doi.org/10.18653/v1/2025.emnlp-demos.55.

Hu, Y., Fan, C., Samyoun, S., & Du, J. (2025). Log-To-Leak: Prompt injection attacks on tool-using LLM agents via Model Context Protocol. OpenReview. Retrieved from: https://openreview.net/forum?id=UVgbFuXPaO (date accessed: February 12, 2026).

Liu, Y., Deng, G., Li, Y., Wang, K., Wang, Z., Wang, X., Zhang, T., Liu, Y., Wang, H., Zheng, Y., Zhang, L. Y., & Liu, Y. (2023). Prompt injection attack against LLM-integrated applications. arXiv. https://doi.org/10.48550/arXiv.2306.05499.

Wichers, N., Denison, C., & Beirami, A. (2024). Gradient-based language model red teaming. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 2862–2881). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.eacl-long.175.

Gong, N. (2025). Securing LLM agents against prompt injection attacks. Duke University. Retrieved from: https://people.duke.edu/~zg70/code/PromptInjection.pdf (date accessed: January 14, 2026).

Unit 42, Palo Alto Networks. (2025). New prompt injection attack vectors through MCP sampling. Retrieved from: https://unit42.paloaltonetworks.com/model-context-protocol-attack-vectors/ (date accessed: December 18, 2025).

Beurer-Kellner, L., Buesser, B., Creţu, A.-M., Debenedetti, E., Dobos, D., Fabian, D., Fischer, M., Froelicher, D., Grosse, K., Naeff, D., Ozoani, E., Paverd, A., Tramèr, F., & Volhejn, V. (2025). Design patterns for securing LLM agents against prompt injections. arXiv. https://doi.org/10.48550/arXiv.2506.08837.

Greshake, K., Abdelnabi, S., Mishra, S., Endres, C., Holz, T., & Fritz, M. (2023). Not what you’ve signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection. In Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security (AISec ’23) (pp. 79–90). Association for Computing Machinery. https://doi.org/10.1145/3605764.3623985.

Shi, J., Yuan, Z., Liu, Y., Huang, Y., Zhou, P., Sun, L., & Gong, N. Z. (2024). Optimization-based prompt injection attack to LLM-as-a-judge. In Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security (pp. 660–674). Association for Computing Machinery. https://doi.org/10.1145/3658644.3690291.

OWASP GenAI Security Project. (2025). LLM01:2025 Prompt injection. Retrieved from: https://genai.owasp.org/llmrisk/llm01-prompt-injection/ (date accessed: November 17, 2025).

Hou, X., Zhao, Y., Wang, S., & Wang, H. (2025). Model Context Protocol (MCP): Landscape, security threats, and future research directions. arXiv. https://doi.org/10.48550/arXiv.2503.23278.

OWASP. (2024). OWASP Top 10 for LLM Applications 2025 (PDF). Retrieved from: https://owasp.org/www-project-top-10-for-large-language-model-applications/assets/PDF/OWASP-Top-10-for-LLMs-v2025.pdf (date accessed: November 21, 2025).

Cloudflare. (n.d.). What are the OWASP Top 10 risks for LLMs? Retrieved from: https://www.cloudflare.com/learning/ai/owasp-top-10-risks-for-llms/ (date accessed: February 4, 2026).

Jia, Y., Shao, Z., Liu, Y., Jia, J., Song, D., & Gong, N. Z. (2025). A critical evaluation of defenses against prompt injection attacks. arXiv. https://doi.org/10.48550/arXiv.2505.18333.

Li, H., Liu, X., Zhang, N., & Xiao, C. (2025). PIGuard: Prompt injection guardrail via mitigating overdefense for free. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 30420–30437). Association for Computational Linguistics. https://doi.org/10.18653/v1/2025.acl-long.1468.

Jia, Y., Liu, Y., Shao, Z., Jia, J., & Gong, N. Z. (2025). PromptLocate: Localizing prompt injection attacks. arXiv. https://doi.org/10.48550/arXiv.2510.12252.

Hui, B., Yuan, H., Gong, N., Burlina, P., & Cao, Y. (2024). PLeak: Prompt leaking attacks against large language model applications. In Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security (pp. 3600–3614). Association for Computing Machinery. https://doi.org/10.1145/3658644.3670370.

International Journal of Advanced Artificial Intelligence Research
