Voice Cloning Statistics 2026: 47+ Data Points on Market Growth, Adoption, and Fraud Risks

47+ voice cloning statistics for 2026, covering market size, adoption by industry, latency benchmarks, and the fraud surge regulators are racing to contain. Every figure sourced to primary research from Pindrop, FTC, McKinsey, Pew, McAfee, FCC, and the EU AI Act.

ElevenLabs reached an $11 billion valuation in February 2026 after raising $500M from Sequoia Capital (Bloomberg, 2026). The global voice cloning market grew to $2.4 billion in 2025 and is projected to hit $9.6 billion by 2030 at a 26% CAGR (Mordor Intelligence, Voice Cloning Market Report 2025). At the same time, Pindrop tracked a 680% year-over-year increase in deepfake voice activity and a 1,300% surge in contact-center fraud attempts (Pindrop, 2025 Voice Intelligence and Security Report).

We aggregated data from the U.S. Federal Trade Commission, the FBI Internet Crime Complaint Center (IC3), the Federal Communications Commission, the European Commission, McKinsey, Pindrop, McAfee, Pew Research Center, Audible, Mordor Intelligence, and a dozen primary reports to build the most current picture of where voice cloning stands in 2026 — and where it is heading.

Key Takeaways

  • ElevenLabs raised $500M Series D from Sequoia Capital at an $11 billion valuation in February 2026 (Bloomberg, 2026).
  • ElevenLabs ARR reached $500M in April 2026, up from $330M at the end of 2025 (Sacra / TechCrunch, 2026).
  • The global voice cloning market reached $2.4B in 2025 and is projected to hit $9.6B by 2030 at a 26% CAGR (Mordor Intelligence, 2025).
  • Pindrop tracked a 680% YoY increase in deepfake voice activity across its enterprise customer base (Pindrop, 2025 Voice Intelligence and Security Report).
  • Contact-center deepfake fraud attempts surged 1,300% — from roughly one per month to seven per day on average (Pindrop, 2025).
  • U.S. FTC logged over 1 million imposter-scam reports in 2025, with losses of $3.5 billion — the #1 scam category for nine years running (FTC, 2025).
  • 25% of adults surveyed across seven countries said they or someone they know had experienced an AI voice scam (McAfee, The Artificial Imposter 2023).
  • 70% of surveyed adults said they could not reliably distinguish a cloned voice from the real person (McAfee, 2023).
  • 88% of organizations use AI in at least one business function and 71% regularly deploy generative AI (McKinsey, State of AI 2025).
  • The FCC ruled AI-generated voices in robocalls illegal under the TCPA, with fines of more than $23,000 per call (FCC, February 2024).
  • The EU AI Act’s transparency obligations (Article 50) for AI providers, including synthetic voice, apply from August 2, 2026 (European Commission / EU AI Act, 2026).
  • Leading voice cloning models benchmark at 40–150 ms latency in 2026 (Cartesia, ElevenLabs Flash v2.5, CosyVoice2).

1. Market Size and Growth Projections

The voice cloning market is in early-stage hyper-growth: multiple firms project a 26–28% CAGR through 2030, which is roughly double the broader speech-AI category. The variance across reports (from $2.4B to $3.3B for 2025) reflects methodology differences: some include only standalone cloning platforms (ElevenLabs, Resemble), others include voice cloning embedded inside larger TTS or contact-center products.

[Figure 1 data: Voice cloning market, 2024–2030 (USD billions). 2024: $2.7; 2025: $3.4; 2026: $4.3; 2027: $5.4; 2028: $6.8; 2029: $8.5; 2030: $10.8]
Figure 1: Voice cloning market trajectory. Intermediate years are interpolated between firm-reported endpoints by compounding at a 26% CAGR (geometric rather than linear interpolation). Source: Mordor Intelligence, IMARC Group (2024–2025 reports).
Metric | Value | Source
Voice cloning market (2024) | ~$2.7 billion | IMARC Group, Voice Cloning Market Report 2024
Voice cloning market (2025) | $2.4–3.3 billion (varies by scope) | Mordor Intelligence / The Business Research Company, 2025
Voice cloning market projection (2030) | $9.6–10.8 billion | Mordor Intelligence / IMARC, 2025
Voice cloning CAGR (2024–2030) | 26.0–28.4% | Mordor / IMARC / market.us, 2025
ElevenLabs valuation (Feb 2026, Series D) | $11 billion | Bloomberg, 2026
ElevenLabs ARR (April 2026) | $500 million | Sacra / TechCrunch, 2026
ElevenLabs total funding (5 rounds at Series D) | $781 million | Bloomberg / ElevenLabs, Feb 2026
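
As a quick check on Figure 1, the trajectory can be reproduced by compounding the 2024 base at a fixed annual rate. The sketch below is a minimal illustration, assuming the $2.7B base and 26% CAGR quoted above; the helper names `project` and `implied_cagr` are ours, not from any cited report.

```python
def project(base: float, rate: float, years: int) -> list[float]:
    """Compound `base` forward at `rate` per year; returns years + 1 values."""
    return [base * (1 + rate) ** n for n in range(years + 1)]


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes `start` to `end` over `years` years."""
    return (end / start) ** (1 / years) - 1


# Assumed inputs taken from the article: $2.7B in 2024, 26% CAGR, through 2030.
series = project(2.7, 0.26, 6)
for year, value in zip(range(2024, 2031), series):
    print(f"{year}: ${value:.1f}B")

# Going the other way: the rate implied by the 2024 and 2030 endpoints.
print(f"Implied CAGR: {implied_cagr(2.7, 10.8, 6):.1%}")
```

Because growth compounds, each year's absolute increase is larger than the last, which is why the curve steepens toward 2030. Rounding here may differ from the chart labels by a tenth in some years.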

Valuation growth at ElevenLabs alone — from $1.1B (Jan 2024) to $3.3B (Jan 2025) to $11B (Feb 2026) — illustrates how fast capital is repricing the category. Total funding at the time of the Series D stood at $781 million across five rounds; subsequent tranches have brought this higher per tracker data. For a deeper feature breakdown of what “real-time voice cloning” actually means in 2026, see our voice cloning software guide.

2. Enterprise Adoption: Who Is Actually Using Voice AI

McKinsey’s November 2025 State of AI survey reframed the conversation: the question is no longer “is AI being adopted” but “is it generating returns.” Eighty-eight percent of organizations now use AI somewhere; only 5.5% report meaningful financial returns. Voice and conversational interfaces are among the most common use-case categories — and high-performer organizations are 3.6× more likely than peers to pursue transformative redesigns rather than point-feature pilots.

Metric | Value | Source
Organizations using AI in ≥1 business function | 88% | McKinsey, The State of AI 2025
Organizations regularly deploying generative AI | 71% | McKinsey, 2025
Organizations using or experimenting with AI agents | 62% | McKinsey, 2025
Organizations seeing real financial returns from AI | 5.5% | McKinsey, 2025
High performers’ likelihood of transformative AI redesign | 3.6× peers | McKinsey, 2025
Voice AI as one of most common reported use cases | Conversational interfaces in top tier | McKinsey, 2025

Adoption is leading trust by a wide margin. Enterprises pilot the technology aggressively while consumers remain skeptical, and that gap is the single biggest variable shaping 2026 product roadmaps. If you want to experiment without a cloud-API dependency, our walkthrough on how to clone your voice with AI covers the local workflow.

3. Voice Cloning Adoption by Industry

Gaming and healthcare are the fastest-growing verticals by CAGR, but media and entertainment dominate by revenue today. Customer support has the highest enterprise pilot rate but also the largest unresolved consumer-trust gap. Government voice cloning implementations jumped 64% in 2024, an unusually fast turnaround for the public sector, as ministries integrated synthetic voice into transit announcements, accessibility services, and contact centers.

Industry | Indicator | Source
Media & entertainment | Largest commercial segment by revenue | Mordor Intelligence, Voice Cloning Market Report 2025
Chatbots & voice assistants | 34% of total voice cloning market (2024) | Mordor / market.us, 2024
Gaming | 33.7% CAGR (fastest-growing vertical) | Mordor, 2025
Healthcare & life sciences | 31.9% CAGR | Mordor, 2025
Government implementations | +64% YoY in 2024 | Mordor, 2025
Dubbing (cost & time savings) | 40% cost reduction, 60% faster cycles | Camb.ai / industry case studies, 2025
Audible AI narration launch | May 13, 2025 (100+ synthetic voices) | Audible / Publishers Weekly, 2025
Digital audio share of trade book sales | 12.2% (Feb 2025) | AAP StatShot Report, 2025

Audible’s launch is the bellwether for legitimate commercial use. The platform began rolling out AI-narrated audiobook production, including translation and accent control, to an invitational publisher group in May 2025. That rollout comes ahead of the EU AI Act’s Article 50 transparency obligations for synthetic-audio providers, which apply from August 2, 2026.

4. Fraud, Scams, and Security Risks

This is the section regulators read first, and the numbers justify the attention. Pindrop’s enterprise customer base saw deepfake voice activity surge 680% year over year in 2024, with contact-center fraud attempts up 1,300% (from roughly one attempt per month to seven per day). Voice-clone-enabled imposter scams are now the fastest-growing fraud subcategory in U.S. consumer-protection data. The technical barrier to launching an attack is low enough that detection — not prevention — has become the active research frontier.

[Figure 2 data: YoY deepfake voice fraud increase (2024). Banking: +149%; Insurance: +475%; Deepfake activity (overall): +680%; Contact-center attempts: +1,300%. Source: Pindrop, 2025 Voice Intelligence and Security Report.]
Figure 2: Deepfake voice fraud by sector. Pindrop attributes the +1,300% contact-center figure to a shift from roughly one fraud attempt per month to seven per day across its enterprise customer base.
Metric | Value | Source
FTC imposter-scam reports (2025) | >1 million | FTC, 2025
FTC reported losses to imposter scams (2025) | $3.5 billion | FTC, 2025
FTC total fraud losses (2024) | $12.5 billion | FTC, March 2025
FTC total fraud losses (2025) | $15.9 billion (record) | FTC testimony, March 2026
Older adults losing $10K+ to impersonation scams | +4× since 2020 | FTC, 2025
Combined losses by older adults losing $100K+ | $55M (2020) → $445M (2024), 8× | FTC, 2025
Pindrop deepfake voice activity (YoY) | +680% | Pindrop, 2025 Voice Intelligence & Security Report
Contact-center deepfake fraud attempts (YoY) | +1,300% (~1/month → 7/day) | Pindrop, 2025
Retail contact-center calls flagged as fraud | 1 in every 127 | Pindrop, 2025
Projected 2025 contact-center fraud exposure | $44.5 billion | Pindrop, 2025
Average deepfake fraud exposure per contact center | $343,000 | Pindrop, 2025
Synthetic voice fraud in insurance (2024) | +475% | Pindrop, 2025
Synthetic voice fraud in banking (2024) | +149% | Pindrop, 2025
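
Percentage increases above 100% are easy to misread: a +1,300% rise means the new level is 14 times the old one, not 13. The sketch below shows the conversion in both directions, applied to the year-over-year figures reported in the table above. The function names are illustrative only.

```python
def pct_increase_to_multiplier(pct_increase: float) -> float:
    """Convert a percentage increase to the new level as a multiple of the old."""
    return 1 + pct_increase / 100


def pct_increase(old: float, new: float) -> float:
    """Percentage increase from `old` to `new`."""
    return (new / old - 1) * 100


# Year-over-year increases as reported in the table above.
reported = {
    "Banking": 149,
    "Insurance": 475,
    "Deepfake activity (overall)": 680,
    "Contact-center attempts": 1300,
}
for label, pct in reported.items():
    print(f"{label}: +{pct}% = {pct_increase_to_multiplier(pct):.2f}x")

# FTC older-adult losses in millions of USD: $55M (2020) to $445M (2024).
print(f"Older-adult $100K+ losses: +{pct_increase(55, 445):.0f}%")
```

Keeping the two forms separate helps when comparing sources, since some reports quote multiples ("8×") while others quote percentage increases.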

Pindrop’s 680% number captures detected attack volume — the leading indicator security teams use to plan staffing and tooling — not necessarily successful fraud completions. The detection-evasion arms race is what makes voice authentication a contested category in 2026.

5. Latency and Quality Benchmarks

Latency claims in marketing copy obscure a wide spread. Tools advertising sub-100 ms latency typically run on cloud GPUs with first-token-only measurements; tools showing 250–500 ms on consumer hardware deliver more natural-sounding outputs in blind listening tests. Cartesia and ElevenLabs Flash v2.5 now ship at 40 ms and 75 ms time-to-first-audio respectively — well below the 300 ms threshold that matches the natural pause length in human conversation, beyond which delay becomes perceptible.
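
The gap between first-chunk timing and full-utterance timing is easy to see when both are measured on the same stream. The sketch below is a generic timing harness, not any vendor's benchmark method: `fake_tts_stream` is a hypothetical stand-in for a streaming synthesis call, and its delays are invented values.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_stream(chunks: Iterable[bytes]) -> Tuple[float, float]:
    """Return (time_to_first_audio_ms, total_ms) for a stream of audio chunks."""
    start = time.perf_counter()
    first_ms = None
    for _ in chunks:
        if first_ms is None:
            # The time-to-first-audio clock stops when the first chunk arrives.
            first_ms = (time.perf_counter() - start) * 1000
    total_ms = (time.perf_counter() - start) * 1000
    if first_ms is None:
        raise ValueError("stream produced no audio")
    return first_ms, total_ms


def fake_tts_stream(first_delay_s: float = 0.04,
                    chunk_delay_s: float = 0.02,
                    n_chunks: int = 5) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS call (delays are invented)."""
    time.sleep(first_delay_s)
    yield b"\x00" * 320
    for _ in range(n_chunks - 1):
        time.sleep(chunk_delay_s)
        yield b"\x00" * 320


ttfa_ms, total_ms = measure_stream(fake_tts_stream())
print(f"time to first audio: {ttfa_ms:.0f} ms, full utterance: {total_ms:.0f} ms")
```

Reporting both numbers side by side makes it clear whether a quoted latency covers only the first chunk or the whole utterance.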

[Figure 3 data: Real-time voice cloning latency (ms, lower is better). Cartesia: 40; ElevenLabs Flash v2.5: 75; Fish Audio S2: 100; Smallest AI Lightning: 100; Inworld Mini (P90): ~130; CosyVoice2-0.5B: 150. Reference thresholds: 250 ms (natural flow), 300 ms (perceptible). Sources: Inworld 2026 voice AI benchmarks; SiliconFlow edge benchmarks; AssemblyAI latency guidance.]
Figure 3: Time-to-first-audio across leading models. Bars below the 250 ms and 300 ms threshold lines preserve a sense of natural conversational flow; bars approaching 300 ms start to feel like delay to most listeners.
Metric | Value | Source
Cartesia time-to-first-audio | 40 ms | Inworld AI Voice Benchmarks 2026
ElevenLabs Flash v2.5 inference latency | 75 ms | Inworld benchmarks, 2026
Fish Audio S2 TTFA (single H200 GPU) | ~100 ms | Inworld, 2026
Smallest AI Lightning (10s of speech) | 100 ms | Inworld, 2026
CosyVoice2-0.5B (edge / streaming) | 150 ms | SiliconFlow edge benchmarks, 2026
Inworld Mini end-to-end P90 | <130 ms | Inworld, 2026
Human-perception threshold for natural conversational flow | <250 ms | AssemblyAI / industry consensus, 2025
Natural conversational pause length | ~300 ms | AssemblyAI, 2025
LLM inference share of total voice-to-voice latency | 40–60% | AssemblyAI / Inworld, 2026
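
To make the thresholds in the table concrete, the sketch below sorts the benchmark figures into bands. The band labels and the cut-offs at 250 ms and 300 ms are our reading of the guidance cited above, not a formal standard, and the latencies are copied from the table rather than measured here.

```python
NATURAL_MS = 250      # below this, conversational flow is reported to feel natural
PERCEPTIBLE_MS = 300  # roughly the length of a natural conversational pause

# Latency values as listed in the table above (ms); "<130" is entered as 130.
MODELS = {
    "Cartesia": 40,
    "ElevenLabs Flash v2.5": 75,
    "Fish Audio S2": 100,
    "Smallest AI Lightning": 100,
    "Inworld Mini (P90)": 130,
    "CosyVoice2-0.5B": 150,
}


def band(latency_ms: float) -> str:
    """Place a latency in one of three illustrative bands."""
    if latency_ms < NATURAL_MS:
        return "natural"
    if latency_ms <= PERCEPTIBLE_MS:
        return "borderline"
    return "perceptible delay"


def headroom_ms(latency_ms: float) -> float:
    """Budget left for the other pipeline stages before reaching 250 ms."""
    return max(0.0, NATURAL_MS - latency_ms)


for name, ms in MODELS.items():
    print(f"{name}: {ms} ms -> {band(ms)}, {headroom_ms(ms):.0f} ms headroom")
```

Every model in the table lands in the first band on speech synthesis alone. The headroom figure matters because, per the table, LLM inference accounts for 40–60% of total voice-to-voice latency.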

For an apples-to-apples comparison of how local voice changers handle the latency-versus-quality trade-off, our Voicemod alternative comparison breaks down what cloud and on-device approaches each cost in milliseconds — and our latency explainer goes deeper on the engineering trade-offs.

6. Consumer Trust, Public Perception, and Regulation

In the U.S., 50% of adults say they are more concerned than excited about AI in daily life, while only 10% report being more excited than concerned (Pew Research, June 2025). The same surveys that show majority concern about voice-clone-fueled robocalls also show majority support for legitimate accessibility and entertainment uses. The regulatory response is fragmented: the U.S. has acted at the FCC level on robocalls and is moving on state-level deepfake laws; the EU brings voice cloning fully into the AI Act’s Article 50 transparency regime on August 2, 2026; and several Asian jurisdictions require explicit consent and disclosure.

Metric | Value | Source
Adults globally more concerned than excited about AI | 34% (median across 25 countries) | Pew Research, Views of AI Around the World, October 2025
U.S. adults more concerned than excited about AI | 50% (June 2025) | Pew Research, 2025
U.S. adults more excited than concerned | 10% | Pew Research, 2025
Adults who think AI voices/avatars should require disclosure | ~50% | CivicScience, 2025
McAfee survey scope | 7,054 adults across 7 countries (US, UK, FR, DE, JP, AU, IN) | McAfee, 2023
Adults experiencing AI voice scam or knowing someone who did | 25% | McAfee, The Artificial Imposter, 2023
Adults receiving an AI voice clone message | ~10% | McAfee, 2023
Voice-scam recipients who lost money | 77% | McAfee, 2023
Adults who could NOT reliably identify a cloned voice | 70% | McAfee, 2023
Adults sharing voice data online ≥1× weekly | 53% | McAfee, 2023
FCC ruling on AI-generated robocalls | Illegal under TCPA (Feb 8, 2024) | FCC, 2024
Maximum FCC fine per illegal AI robocall | >$23,000 | FCC, 2024
Private right of action (per call) | Up to $1,500 | FCC, 2024
EU AI Act Article 50 transparency obligations for synthetic audio | Applies from August 2, 2026 | EU AI Act / European Commission, 2026
EU AI Act first Code of Practice on watermarking | Draft published December 17, 2025 | Cooley / European Commission, 2025

Most credible voice-AI tools shipped in 2025 and 2026 added audible watermarks, provenance metadata (C2PA), or both, even when not strictly legally required, because the EU AI Act’s draft Code of Practice signals that a single watermarking technique won’t be sufficient on its own. A multi-layered approach (imperceptible pixel/audio watermarks plus logging and fingerprinting for verification) is now the compliance baseline.
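
As a rough illustration of the logging-and-fingerprinting layer, the sketch below records a SHA-256 digest and basic metadata for each generated clip. This is a toy under stated assumptions, not C2PA and not a watermark: the record fields `generator` and `consent_ref` are hypothetical, and a cryptographic hash only matches byte-identical files, whereas production fingerprinting would need to survive re-encoding.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_record(audio: bytes, generator: str, consent_ref: str) -> dict:
    """Build a log entry describing one synthetic audio clip."""
    return {
        "sha256": hashlib.sha256(audio).hexdigest(),
        "bytes": len(audio),
        "generator": generator,      # hypothetical field: model or tool name
        "consent_ref": consent_ref,  # hypothetical field: consent record ID
        "synthetic": True,
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }


def matches(audio: bytes, record: dict) -> bool:
    """True only if `audio` is byte-identical to the logged clip."""
    return hashlib.sha256(audio).hexdigest() == record["sha256"]


clip = b"RIFF....WAVEfmt "  # placeholder bytes standing in for real audio
entry = provenance_record(clip, generator="example-model", consent_ref="consent-0001")
print(json.dumps(entry, indent=2))
print(matches(clip, entry))            # same bytes as logged
print(matches(clip + b"\x00", entry))  # any change to the bytes breaks the match
```

A log like this complements watermarking rather than replacing it: the watermark travels with the audio, while the log gives the provider something to verify against later.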

Voice Cloning by the Numbers (Summary)

Metric | Value | Source
Voice cloning market (2025) | $2.4–3.3 billion | Mordor / TBRC, 2025
Voice cloning market projection (2030) | $9.6–10.8 billion | Mordor / IMARC, 2025
Voice cloning CAGR (2024–2030) | 26.0–28.4% | Mordor / IMARC / market.us, 2025
ElevenLabs valuation (Feb 2026) | $11 billion | Bloomberg, 2026
ElevenLabs ARR (April 2026) | $500 million | Sacra / TechCrunch, 2026
ElevenLabs total funding (at Series D) | $781 million (5 rounds) | Bloomberg / ElevenLabs, Feb 2026
Organizations using AI in ≥1 function | 88% | McKinsey, 2025
Organizations regularly deploying gen AI | 71% | McKinsey, 2025
Organizations seeing real financial returns | 5.5% | McKinsey, 2025
Pindrop deepfake voice activity (YoY) | +680% | Pindrop, 2025
Contact-center deepfake fraud attempts (YoY) | +1,300% | Pindrop, 2025
Projected 2025 contact-center fraud exposure | $44.5 billion | Pindrop, 2025
FTC imposter-scam losses (2025) | $3.5 billion | FTC, 2025
FTC total fraud losses (2024) | $12.5 billion | FTC, March 2025
FTC total fraud losses (2025) | $15.9 billion (record) | FTC testimony, March 2026
McAfee adults unable to identify cloned voice | 70% | McAfee, 2023
McAfee adults with personal voice-scam exposure | 25% | McAfee, 2023
FCC AI-robocall ruling | Feb 8, 2024 | FCC, 2024
EU AI Act Article 50 applies | August 2, 2026 | EU AI Act, 2026
Cartesia time-to-first-audio | 40 ms | Inworld, 2026
ElevenLabs Flash v2.5 latency | 75 ms | Inworld, 2026
Pew global AI concern (median, 25 countries) | 34% | Pew, October 2025

Methodology and Sources

We compiled this roundup by tracing each statistic to a Tier 1 primary source: government report, market research firm publication, peer-reviewed study, or original company disclosure. Where multiple firms reported different figures for the same metric (typically market size and CAGR), we cited each in context and noted the variance.

Last updated: May 2026. We refresh this page quarterly as new annual reports are released (Pindrop, FTC, McKinsey, Pew, and Mordor all publish on different cadences — typically Q1 for FTC fraud data, late spring for Pindrop, autumn for McKinsey and Pew).

For practical context on how the latency and quality numbers above translate into a real Windows voice tool, see our free AI voice generator overview — it covers what local inference looks like outside the cloud-API model that most of this article’s data is centered on.
