Independent Researcher, USA.
World Journal of Advanced Engineering Technology and Sciences, 2026, 18(03), 215-223
Article DOI: 10.30574/wjaets.2026.18.3.0113
Received on 09 January 2026; revised on 21 February 2026; accepted on 24 February 2026
This research paper examines the evolution and current state of serverless computing frameworks for deploying artificial intelligence (AI) models in real-time applications. As AI adoption accelerates across industries, the need for efficient, scalable, and cost-effective deployment solutions has become increasingly critical. Serverless computing offers a promising approach for AI model deployment by providing on-demand computational resources without infrastructure management overhead. This study evaluates five prominent serverless computing frameworks—AWS Lambda, Google Cloud Functions, Azure Functions, OpenWhisk, and Knative—against key performance metrics relevant to AI workloads. Through empirical testing with various AI model types and sizes, we analyze cold start latency, execution time, throughput, cost efficiency, and resource utilization. Our findings indicate that while all frameworks demonstrate viable paths for AI deployment, significant differences exist in their performance characteristics depending on model complexity, input size, and concurrency requirements. We identify specific optimization strategies and architectural patterns that enhance real-time AI deployment on serverless platforms and propose a decision framework to guide implementation choices based on application requirements and constraints.
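One of the optimization strategies the abstract alludes to, mitigating cold start latency, is commonly implemented by loading the AI model at module scope so the expensive deserialization happens once per container rather than on every invocation. The sketch below illustrates this pattern in the AWS Lambda handler style; `load_model`, the dummy model, and the event shape are illustrative assumptions, not code from the paper.

```python
import time

# Hypothetical model loader; in a real Lambda deployment this would
# deserialize a trained model from the deployment package or object storage.
def load_model():
    time.sleep(0.01)            # stand-in for expensive deserialization
    return lambda features: sum(features)  # dummy "model" for illustration

# Loading at module scope means the cost is paid once per container
# (the cold start), not on every warm invocation.
MODEL = load_model()

def handler(event, context=None):
    """Entry point in the AWS Lambda style: `event` carries the input."""
    features = event["features"]
    start = time.perf_counter()
    prediction = MODEL(features)
    latency_ms = (time.perf_counter() - start) * 1000
    return {"prediction": prediction, "latency_ms": latency_ms}
```

On a warm container, `handler` only runs the lightweight inference path, which is why platforms that keep containers alive between requests show much lower tail latency than pure cold-start traffic.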
Serverless Computing; Function-as-a-Service (FaaS); Artificial Intelligence; Machine Learning; Model Deployment; Cloud Computing; Real-time Systems
Manoj Bhoyar. Serverless computing frameworks for real-time AI model deployment. World Journal of Advanced Engineering Technology and Sciences, 2026, 18(03), 215-223. Article DOI: https://doi.org/10.30574/wjaets.2026.18.3.0113