World Journal of Advanced Engineering Technology and Sciences
International, Peer-reviewed, Refereed, Open-access | ISSN Approved Journal

ISSN: 2582-8266 (Online)

AI powered voice synthesizer

V. Vanaja, Venkatesham Tunge, Nithin Kumar Kanagala *, Harsha Vardhan Bhumandla and Shruti Kana

Department of CSE (Data Science), ACE Engineering College, Hyderabad, Telangana, India.

Research Article

World Journal of Advanced Engineering Technology and Sciences, 2025, 15(02), 663-671

Article DOI: 10.30574/wjaets.2025.15.2.0590

DOI url: https://doi.org/10.30574/wjaets.2025.15.2.0590

Received on 22 March 2025; revised on 02 May 2025; accepted on 04 May 2025

The AI Voice Synthesizer is a real-time, multilingual voice cloning system that uses deep learning to generate personalized, natural-sounding speech. Built on Coqui.ai's open-source XTTSv2 framework, the system lets users synthesize speech in their own voice, or in any sampled voice, by analyzing just a few seconds of reference audio. The resulting voice profile is then used to generate speech in multiple languages, including languages the original speaker has never spoken, a significant step forward for synthetic speech and human-computer interaction.
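The few-shot, cross-lingual cloning described above can be sketched with the Coqui `TTS` Python package. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `build_request` and `clone_voice`, the file paths, and the default output name are hypothetical, and the model identifier is the one commonly used for XTTS-v2 in Coqui's model catalogue. The package is imported lazily, so the validation helper can be exercised without the model installed.

```python
from pathlib import Path

# Assumed catalogue name for XTTS-v2 in the Coqui TTS package.
XTTS_V2_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def build_request(text, speaker_wav, language, out_path="output.wav"):
    """Validate inputs and build keyword arguments for TTS.tts_to_file.

    Hypothetical helper: the argument names mirror the Coqui API, but the
    validation rules here are illustrative only.
    """
    if not text.strip():
        raise ValueError("text must be non-empty")
    if not language:
        raise ValueError("a target language code is required")
    return {
        "text": text,
        "speaker_wav": str(speaker_wav),  # a few seconds of reference audio
        "language": language,             # may differ from the speaker's own language
        "file_path": str(out_path),
    }


def clone_voice(text, speaker_wav, language, out_path="output.wav"):
    """Synthesize `text` in the reference speaker's voice (requires Coqui TTS)."""
    from TTS.api import TTS  # lazy import: only needed when actually synthesizing

    tts = TTS(XTTS_V2_MODEL)
    tts.tts_to_file(**build_request(text, speaker_wav, language, out_path))
    return Path(out_path)
```

For example, `clone_voice("Hola, ¿cómo estás?", "my_voice.wav", "es")` would request Spanish speech from an English reference sample, assuming the package and model weights are available locally.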

Traditional text-to-speech (TTS) systems often suffer from a robotic tone, a lack of personalization, limited language support, and high latency. In contrast, this project provides a lightweight, low-latency (<200 ms), user-friendly platform that supports cross-lingual, few-shot voice cloning. The system is modular and consists of four independent components: speaker embedding extraction, multilingual text processing, real-time speech synthesis, and a web-based front end. These components are integrated into a single workflow that is accessible to non-technical users while remaining scalable and customizable for developers and researchers.
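The modular layout and the latency target above can be illustrated with a small sketch. Every function here is a hypothetical stand-in: the embedding, text-processing, and synthesis stages are placeholders rather than real models, and only the structure (independent stages chained together and timed against a 200 ms budget) reflects the description.

```python
import time
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VoiceProfile:
    """Placeholder for a speaker embedding."""
    embedding: List[float]


def extract_speaker_embedding(samples: List[float]) -> VoiceProfile:
    # Stand-in for a speaker encoder: mean and mean absolute amplitude.
    n = max(len(samples), 1)
    mean = sum(samples) / n
    energy = sum(abs(s) for s in samples) / n
    return VoiceProfile([mean, energy])


def normalize_text(text: str, language: str) -> str:
    # Stand-in for multilingual text processing: collapse whitespace only.
    return " ".join(text.split())


def synthesize(text: str, profile: VoiceProfile, language: str) -> List[float]:
    # Stand-in for the synthesis model: one silent sample per character.
    return [0.0] * len(text)


def run_pipeline(reference: List[float], text: str, language: str,
                 budget_ms: float = 200.0) -> Tuple[List[float], float, bool]:
    """Run the three back-end stages and report whether the budget was met."""
    start = time.perf_counter()
    profile = extract_speaker_embedding(reference)
    clean = normalize_text(text, language)
    audio = synthesize(clean, profile, language)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return audio, elapsed_ms, elapsed_ms <= budget_ms
```

Because each stage is a plain function with a narrow interface, any one of them could be swapped for a real model without touching the others, and the same timing wrapper would show which replacement pushes the pipeline past its latency budget.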

Keywords: Real-Time Speech Synthesis; Few-Shot Text-To-Speech; Multilingual TTS; Coqui.AI Speaker Embedding; Personalized Synthetic Voice; Real-Time Voice Cloning System

https://wjaets.com/sites/default/files/fulltext_pdf/WJAETS-2025-0590.pdf

V. Vanaja, Venkatesham Tunge, Nithin Kumar Kanagala, Harsha Vardhan Bhumandla and Shruti Kana. AI powered voice synthesizer. World Journal of Advanced Engineering Technology and Sciences, 2025, 15(02), 663-671. Article DOI: https://doi.org/10.30574/wjaets.2025.15.2.0590.

Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.

