World Journal of Advanced Engineering Technology and Sciences

ISSN: 2582-8266 (Online)

Scaling deep learning models: Challenges and solutions for large-scale deployments


Ankush Jitendrakumar Tyagi *

University of Texas at Arlington, Texas, USA.

Review Article

World Journal of Advanced Engineering Technology and Sciences, 2025, 16(02), 010–020

Article DOI: 10.30574/wjaets.2025.16.2.1252

DOI url: https://doi.org/10.30574/wjaets.2025.16.2.1252

Received on 26 June 2025; revised on 28 July 2025; accepted on 31 July 2025

Deep learning (DL) models have achieved state-of-the-art performance across numerous domains, including natural language processing, computer vision, and speech recognition. However, the transition from research to production, especially at large scale, presents formidable challenges. As model sizes grow to billions of parameters and user demand rises rapidly, training time, inference latency, energy consumption, system reliability, and hardware constraints become significant obstacles. Efficiently scaling DL models is not just a matter of model architecture; it requires a multi-faceted approach encompassing algorithmic, infrastructural, and deployment-level strategies. Large-scale deployments must account for distributed training across heterogeneous hardware, inference throughput under real-time constraints, memory and communication bottlenecks, and deployment flexibility from cloud clusters to edge devices. The performance and cost-efficiency of DL systems at scale hinge on techniques such as model and data parallelism, quantisation, mixed-precision training, and sharded inference. Additionally, orchestration tools like Kubernetes, together with specialised inference runtimes such as TensorRT and NVIDIA Triton, are critical for automated, scalable deployment pipelines. This paper presents an in-depth technical analysis of the core challenges inherent in scaling DL models, examines modern solutions and their trade-offs, and proposes an integrated framework to address real-world deployment needs. By combining innovations at both the model level and the system-infrastructure level, it aims to enable resilient, scalable, production-grade AI deployments.
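
As an illustration of one technique named in the abstract, the following is a minimal sketch of quantisation, assuming a symmetric, per-tensor int8 scheme. It is not taken from the paper; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, and production runtimes typically use calibrated, per-channel variants.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantisation of floats to the int8 range [-127, 127].

    Hypothetical helper for illustration only (not the paper's method).
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero (or empty) tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate float values."""
    return [q * scale for q in quantized]


# Illustrative weights (assumed values, not data from the paper).
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The round-trip error per weight is bounded by half a quantisation step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of four (for float32), at the cost of a round-trip error of at most half a quantisation step, `scale / 2`. This is the kind of memory and accuracy trade-off the abstract refers to.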

Deep Learning Scalability; Large-Scale AI Deployment; Distributed Training; Inference Optimization; Model Parallelism

https://wjaets.com/sites/default/files/fulltext_pdf/WJAETS-2025-1252.pdf


Ankush Jitendrakumar Tyagi. Scaling deep learning models: Challenges and solutions for large-scale deployments. World Journal of Advanced Engineering Technology and Sciences, 2025, 16(02), 010-020. Article DOI: https://doi.org/10.30574/wjaets.2025.16.2.1252.



Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.

