Voice Cloning for Accessibility TTS: Personal Voice for Devices

Accessibility voice cloning has moved from research lab to bedside table in the span of a few years. For people living with ALS or MND, facing a laryngectomy, or managing any condition that progressively erodes speech, preserving their own voice and later using it through a TTS device or smartphone, rather than a generic robotic synthesizer, is no longer a distant possibility. It is available today, and this guide explains how.

We will cover the technology clearly, compare the main platforms including Apple Personal Voice, Acapela My-own-voice, VocaliD, ElevenLabs, and VoxBooster, and give practical advice on timing, recording quality, and AAC device integration.

Key Takeaways

Voice banking should start early — before significant speech deterioration — to capture the best source material.
Apple Personal Voice (iOS 17+) offers free, on-device voice cloning for users in supported languages.
Professional AAC platforms (Acapela, VocaliD) provide high-fidelity models designed specifically for augmentative communication devices.
AI voice synthesis platforms (ElevenLabs, VoxBooster) offer faster turnaround and more flexible routing options.
A cloned voice can be used with AAC hardware, screen readers, virtual microphones, and TTS apps across Windows, iOS, and Android.
Voice cloning for elective surgery (e.g., laryngectomy for cancer treatment) is equally valid and should be planned pre-operatively.

What Is Accessibility Voice Cloning?

Accessibility voice cloning is the application of AI voice synthesis to create a personalized text-to-speech model based on recordings of a specific person’s voice. The resulting model allows that person to type text and have it spoken aloud in a voice that sounds like their own, rather than a generic synthesizer voice.

This matters for a straightforward human reason: identity. A person’s voice carries personality, regional accent, emotional color, and decades of relationships built on that sound. When a condition takes away the physical ability to produce speech, losing the voice’s character on top of the communication loss is a compounding grief. Cloning offers a way to preserve and restore that identity layer.

The technology underpinning this has shifted dramatically. Earlier concatenative voice banking systems stitched together phoneme recordings — functional, but robotic for novel sentences. Current neural TTS models learn the acoustic character of a voice holistically and can synthesize arbitrary text with natural prosody, intonation, and even some emotional coloring.

Who Uses Accessibility TTS Voice Cloning?

ALS and MND Patients

Amyotrophic Lateral Sclerosis (ALS) and Motor Neuron Disease (MND) are the most common diagnoses driving voice banking demand. The disease progresses at different rates, but bulbar-onset ALS can affect speech within months of diagnosis. Clinicians and charities consistently recommend beginning voice recording as soon as possible after diagnosis — ideally while speech is still 100% intelligible and without noticeable fatigue or slurring.

The Stephen Hawking Communication Centre and organizations like the Motor Neurone Disease Association provide guidance and sometimes financial support for this process.

Laryngectomy Patients

A total laryngectomy — surgical removal of the larynx, most often due to laryngeal or thyroid cancer — results in complete loss of natural voice. Unlike ALS, this is typically a scheduled surgery, which means pre-operative voice recording is both possible and strongly recommended. Patients who have recorded their voice before surgery can use a cloned TTS voice immediately post-operatively rather than starting from scratch with an electrolarynx or tracheoesophageal prosthesis alone.

For these patients, voice cloning is not a long-term project but a specific pre-surgical task with a hard deadline.

Spasmodic Dysphonia and Parkinson’s Disease

Spasmodic dysphonia causes involuntary spasms of the vocal cords, making speech effortful and inconsistent. Parkinson’s disease often leads to hypophonia (very quiet, soft speech) and dysarthria. Both populations may reach a point where TTS supplementation or replacement is preferable to struggling through spoken communication.

Recording while speech is relatively clear remains the best strategy: a hypophonic Parkinson’s voice produces a weaker model than a recording made before progression would have.

Elective Situations

Not all voice cloning for TTS use stems from medical diagnosis. Transgender individuals who have not yet undergone voice training may use a cloned voice as a preferred-gender TTS output while their natural voice develops. Public figures who want to create accessible audiobook or AI narrator versions of their voice use cloning for scalable TTS production. Teachers and communicators who rely heavily on their voice may bank it as a precaution.

Apple Personal Voice: On-Device Cloning for Everyone

Apple introduced Personal Voice in iOS 17 and macOS Sonoma (2023) as an accessibility feature that requires no subscription and processes entirely on-device. It is currently available for English (US, UK, Australian, Indian), Spanish, French, German, Italian, Korean, Mandarin, Cantonese, and Japanese.

How to Set Up Apple Personal Voice

Go to Settings > Accessibility > Personal Voice.
Tap Create a Personal Voice and follow the setup prompts.
You will be asked to read approximately 150 phrases aloud. The phrase set is designed to cover a broad phonetic range.
Each session can be as short or long as you want; the recording saves progress so you can complete it across multiple days.
When recording is complete, your device processes the model overnight while charging.
Enable Settings > Accessibility > Live Speech, select your Personal Voice, and you can type to speak in your own cloned voice from Control Center.

Live Speech integration means your Personal Voice is available across FaceTime calls, phone calls, and any other app that uses system audio — not just a standalone TTS app.

Apple’s on-device processing is significant: the model is trained locally rather than on a server, there is no subscription fee, and the finished voice can optionally be shared with your other Apple devices through iCloud. The quality is impressive for a consumer-grade, on-device system, though it is not at the level of professional AAC platform output.

Limitations

English and a limited set of languages only (expanding over time).
Requires iPhone 12 or later, or Apple Silicon Mac.
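No export outside Apple platforms: third-party iOS and macOS apps can request access to a Personal Voice through the system speech APIs, but the voice cannot be moved to Windows or Android software.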
Reading 150 phrases takes ~20–30 minutes of active recording; a fatigued speaker may need to spread this over several days.
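
For a speaker who tires quickly, the arithmetic behind "spread this over several days" can be sketched in a few lines. This is only an illustration: the 25-minute total is the midpoint of the range quoted above, and the 5-minute sitting length is a hypothetical comfort limit, not a clinical recommendation.

```python
import math


def plan_sessions(total_phrases=150, total_minutes=25, minutes_per_session=5):
    """Estimate how many short sittings a phrase set needs.

    total_minutes=25 is the midpoint of the 20-30 minute figure quoted above;
    minutes_per_session=5 is a hypothetical comfort limit, not a recommendation.
    """
    if min(total_phrases, total_minutes, minutes_per_session) <= 0:
        raise ValueError("all arguments must be positive")
    # Average time one phrase takes, including pauses between phrases.
    seconds_per_phrase = total_minutes * 60 / total_phrases
    # Whole phrases that fit in one sitting (always at least one).
    phrases_per_session = max(1, int(minutes_per_session * 60 // seconds_per_phrase))
    sessions = math.ceil(total_phrases / phrases_per_session)
    return {
        "seconds_per_phrase": seconds_per_phrase,
        "phrases_per_session": phrases_per_session,
        "sessions": sessions,
    }
```

With the defaults this works out to about 10 seconds per phrase, 30 phrases per sitting, and 5 sittings. Shortening the sitting length raises the number of days needed accordingly.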

AAC Devices and Professional Voice Banking Platforms

Augmentative and Alternative Communication (AAC) devices range from dedicated hardware (Tobii Dynavox, PRC-Saltillo devices) to software on iPad and Windows tablets. Most modern AAC systems accept custom synthetic voices via their software layer.

Acapela My-own-voice

Acapela Group’s My-own-voice service is one of the oldest and most widely used professional voice banking platforms. It has been specifically designed around the AAC workflow, with partnerships with major AAC device manufacturers.

Process: Users record a set of phrases (typically 50–200) through the web platform. Acapela’s team processes the model and delivers a voice file compatible with their Acapela Voice technology, which installs into Windows and outputs as a SAPI5 voice — natively compatible with most AAC software including Tobii Dynavox Communicator, Grid 3, and others.

Strengths: Direct AAC hardware and software integration, dedicated support for ALS/MND cases, high-quality output, SLP (speech-language pathologist) guidance available.

Limitations: Subscription or per-voice pricing; not free. Language support varies.

VocaliD

VocaliD takes a distinctive approach: if a person has too little usable audio of their own voice, VocaliD blends their existing recordings with a “surrogate” voice from the VocaliD HumanVoice Bank (donors who contribute voice recordings for this purpose). The blend can preserve some acoustic character from the patient even when only minutes of intelligible speech remain.

Process: Record what you can (even degraded speech is useful). VocaliD’s system creates a blended voice. Delivery as a SAPI5-compatible voice for Windows AAC software.

Strengths: Viable even with significant speech deterioration; voice donor community is large; designed specifically for AAC.

Limitations: Subscription model; the blended result is less “purely your voice” than a clean clone from earlier recording. US-centric support, though broader language coverage is growing.

Platform Comparison

Platform	Best For	Min. Recording	Output Format	Cost	On-Device?
Apple Personal Voice	iPhone/Mac users, iOS Live Speech	~150 phrases / 20–30 min	Apple Live Speech	Free	Yes
Acapela My-own-voice	AAC devices, professional SLP workflow	50–200 phrases	SAPI5 (Windows)	Paid	No
VocaliD	Limited speech remaining, donor blend	Any amount	SAPI5 (Windows)	Paid/subscription	No
ElevenLabs	Fast turnaround, app developers	~1 min of audio	API / web player	Free tier + paid	No
VoxBooster	Windows real-time routing, flexible apps	Minutes of audio	Virtual microphone	Paid (3-day trial)	No

ElevenLabs for Accessibility TTS

ElevenLabs has become a popular choice for developers building accessibility apps, largely because of its API-first design and fast voice cloning. Instant Voice Cloning works from as little as 1 minute of audio, at lower quality, while Professional Voice Cloning requires at least 30 minutes of clean audio.

Use cases for accessibility:

Custom TTS apps for iOS or Android that call the ElevenLabs API to speak cloned voice output.
Integration into productivity tools (Notion voice readers, email readers).
Audiobook production using a preserved voice.
Accessible video content where the creator’s voice has changed or been lost.

Limitations: Audio is processed on ElevenLabs servers (not on-device), which is a privacy consideration for some users. Output is primarily through API calls or their web player — connecting it to Windows AAC software requires a custom bridge or virtual microphone routing.
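
To make the "custom TTS app" use case concrete, here is a minimal sketch of what such a call might look like using only the Python standard library. The endpoint path, the xi-api-key header, and the eleven_multilingual_v2 model ID reflect ElevenLabs' public REST documentation as best understood at the time of writing and should be checked against the current API reference. The function names and the output file name are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint shape; confirm against the current ElevenLabs API reference.
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(text, voice_id, api_key, model_id="eleven_multilingual_v2"):
    """Build (but do not send) a request asking for speech in a cloned voice."""
    if not text.strip():
        raise ValueError("text must not be empty")
    payload = {"text": text, "model_id": model_id}
    return urllib.request.Request(
        url=f"{API_BASE}/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "xi-api-key": api_key,  # account key; keep it out of source control
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


def speak_to_file(text, voice_id, api_key, out_path="speech.mp3"):
    """Send the request and save the returned audio (needs network and a valid key)."""
    request = build_tts_request(text, voice_id, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return out_path
```

Separating request construction from sending keeps the network call in one place, which makes it easier to add retries or to swap in a different provider later. Because the text is sent to a remote server, the privacy consideration noted above applies to every call.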

Using VoxBooster for Accessible TTS Routing

VoxBooster is not purpose-built for medical AAC, but it plays a specific and practical role in the accessibility voice cloning pipeline: flexible routing on Windows.

The scenario: you have a cloned voice from ElevenLabs, a fine-tuned AI voice model, or another synthesis platform — but you need to pipe that voice output into a video call, a Windows dictation interface, or an AAC software package that expects microphone input rather than a SAPI5 voice.

VoxBooster’s virtual microphone output registers as a standard Windows audio input device. Any application that accepts a microphone — Zoom, Teams, Discord, Windows Speech Recognition, OBS — can receive the cloned voice as if it were a live microphone feed.

Practical workflow:

Train or upload your voice model in VoxBooster (short recording session, minutes of audio).
Type or dictate text; VoxBooster synthesizes it through your cloned voice model.
Select VoxBooster as the microphone input in any Windows app.
Your cloned voice appears in the receiving app in real time.

This is particularly useful for video calls and real-time communication where SAPI5 integration is not available, and for Windows users who want a single tool handling both voice effects and TTS routing without separate software stacks.

For users specifically focused on real-time communication with a disability-related voice change, our guide on voice changer accessibility for disabilities covers the broader picture of how real-time voice tools are used in assistive contexts.

Voice Preservation for Elective Surgery: A Pre-Op Checklist

If you are facing a laryngectomy or other procedure that will permanently alter your voice, pre-operative voice recording is a clear priority. Here is a practical framework:

At least 4 weeks before surgery:

Contact a speech-language pathologist familiar with AAC and voice banking. They can guide platform selection and phrase sets appropriate for your language and communication style.
Choose a platform based on your hardware (Apple ecosystem vs. Windows AAC device), budget, and language. Acapela My-own-voice and VocaliD have established clinical pathways; Apple Personal Voice is viable for iPhone users.
Record in a quiet room with a USB condenser microphone or a smartphone held 6–8 inches (roughly 15–20 cm) from the mouth. Avoid recording when tired, sick, or after alcohol, because voice quality degrades in ways the model will preserve.
Record personal phrases first: your name, family members’ names, common greetings, your job title, emergency phrases. These are the sentences you will most want to sound like yourself saying.
Complete the platform’s phrase set in full — the randomized phonetic coverage is there for a reason; partial recordings produce weaker models.
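
If you are making your own archive recordings alongside a platform's guided session, a quick automated check can catch obvious problems such as clipping or very quiet takes before you move on. The sketch below assumes 16-bit PCM WAV files and uses illustrative thresholds. It is not a substitute for a platform's own quality checks or for SLP guidance.

```python
import array
import math
import wave


def check_recording(path, min_seconds=1.0, clip_fraction=0.001):
    """Return basic quality facts for a 16-bit PCM WAV file.

    The thresholds are illustrative assumptions, not clinical or vendor guidance.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
        samples = array.array("h", wav.readframes(frames))

    duration = frames / rate if rate else 0.0
    total = len(samples)
    # Samples at (or next to) full scale usually mean the input gain was too high.
    clipped = sum(1 for s in samples if abs(s) >= 32767)
    rms = math.sqrt(sum(s * s for s in samples) / total) if total else 0.0
    level_db = 20 * math.log10(rms / 32768) if rms > 0 else float("-inf")
    return {
        "seconds": duration,
        "sample_rate": rate,
        "channels": channels,
        "level_dbfs": level_db,
        "too_short": duration < min_seconds,
        "clipping": total > 0 and clipped / total > clip_fraction,
    }
```

A take flagged as clipping is usually worth re-recording with the microphone slightly further away or the input gain lowered. A very low level_dbfs value suggests the opposite adjustment.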

Post-surgery:

Configure your chosen AAC or TTS platform to use your cloned voice.
Work with your SLP to integrate it into your AAC device or Windows TTS workflow.
Keep the original recordings archived — cloning technology is improving rapidly, and better models may be trainable from the same data in 2–3 years.

Custom TTS in Screen Readers

Blind and low-vision users who have a strong preference for their own voice — or who need a cloned voice for a specific reason (e.g., a VTuber maintaining a character voice, a user who wants gender-affirming TTS output) — can use a cloned voice with screen readers on Windows.

NVDA and SAPI5: NVDA (NonVisual Desktop Access), one of the most-used free screen readers, supports SAPI5 speech synthesizers. Any cloned voice exported as SAPI5 (Acapela, VocaliD) will appear as an option in NVDA’s synthesizer settings. Installation is typically a single MSI or executable install followed by selecting the voice from NVDA settings.

JAWS: JAWS supports SAPI5 and also has its own Vocalizer Expressive engine. SAPI5 voices from voice banking platforms are compatible.

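Narrator (Windows built-in): Windows Narrator can use SAPI5 voices, selected under Settings > Accessibility > Narrator > Choose a voice on Windows 11 (the category is named Ease of Access on Windows 10). It is less flexible than NVDA or JAWS but works with any SAPI5 voice.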
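
Before opening any of these screen readers, it can help to confirm that the banked voice is actually registered with SAPI5. The sketch below reads the standard SAPI5 voice token location in the Windows registry and looks for a voice by keyword. The voice name in the usage example is hypothetical, and the function simply returns an empty list when run on a non-Windows system.

```python
import sys

# Standard registry location for SAPI5 voice tokens on Windows.
TOKENS_KEY = r"SOFTWARE\Microsoft\Speech\Voices\Tokens"


def installed_sapi5_voices():
    """List SAPI5 voice names from the registry; returns [] off Windows."""
    if sys.platform != "win32":
        return []
    import winreg  # Windows-only standard-library module

    names = []
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TOKENS_KEY) as tokens:
            count = winreg.QueryInfoKey(tokens)[0]  # number of subkeys
            for index in range(count):
                token = winreg.EnumKey(tokens, index)
                with winreg.OpenKey(tokens, token) as voice:
                    try:
                        # The default value of a token key holds the display name.
                        names.append(winreg.QueryValueEx(voice, "")[0])
                    except OSError:
                        names.append(token)  # fall back to the token ID
    except OSError:
        pass  # key missing or unreadable: report whatever was collected
    return names


def find_voice(names, keyword):
    """Return the first voice whose name contains keyword (case-insensitive), else None."""
    needle = keyword.casefold()
    return next((name for name in names if needle in name.casefold()), None)
```

For example, find_voice(installed_sapi5_voices(), "banked") would return a voice named "My Banked Voice" if one were installed, or None otherwise. One caveat, stated as an assumption to verify on your own system: voices installed by a 32-bit installer may be registered under a separate 32-bit registry view and may not show up for 64-bit applications.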

Virtual microphone bridge (VoxBooster route): For screen readers or apps that do not have flexible voice selection but do allow microphone input for dictation, VoxBooster’s virtual microphone output provides a workaround — the cloned voice enters any app through the microphone input path.

The Ethics of Voice Cloning for Accessibility

This topic deserves honest discussion. Voice cloning technology is powerful, and its accessibility applications are genuinely beneficial, but using another person’s voice without consent is harmful, regardless of the stated reason. Three points are worth stating directly:

Consent and ownership: A cloned accessibility voice is ethically grounded when the person being cloned has made informed choices about who can use the model, on what devices, and under what conditions. Family members or caregivers should not commission a clone of someone else’s voice without that person’s clear consent and involvement.

After death: Some families ask about using a deceased person’s voice model for memorial or therapeutic purposes. This is a separate, nuanced question explored in our post on voice cloning memorial ethics. The accessibility context is specifically about living users — the decisions should be theirs.

Medical device boundaries: An AAC voice is a communication tool, not a deepfake. Using a cloned accessibility voice to impersonate the person in contexts they have not authorized — financial transactions, legal declarations, social media — is a misuse that undermines trust in these tools broadly.

For a broader discussion of these issues see our piece on voice cloning ethics 2026.

Getting Started: Which Platform Is Right for You?

Situation	Recommended Starting Point
iPhone or Mac user, English speaker, limited budget	Apple Personal Voice — free, on-device, good quality
ALS/MND diagnosis, using Tobii Dynavox or Grid 3	Acapela My-own-voice — SLP-supported, SAPI5 output
Significant speech deterioration already present	VocaliD — donor-blend approach works with limited audio
Developer building an accessibility app	ElevenLabs API — fastest to integrate, strong documentation
Windows user needing flexible call/meeting routing	VoxBooster — virtual mic output, no kernel driver
Pre-laryngectomy, any platform	Start with Apple Personal Voice OR Acapela; record at least 4 weeks pre-surgery

The decision is not exclusive — many users bank their voice on multiple platforms, since the recording effort overlaps and having redundant models is a sensible precaution.

Internal Resources

If you are coming from a gaming or streaming background and are exploring voice cloning for the first time, our introduction to how to clone your voice with AI covers the technology from the ground up. For the specific medical context of voice banking for ALS and similar diagnoses, our in-depth piece on voice banking for medical patients goes further on clinical workflow, platform selection, and SLP coordination.

Frequently Asked Questions

What is accessibility voice cloning?

Accessibility voice cloning uses AI to create a synthetic version of a person’s voice from audio recordings. People with ALS, laryngectomy, or other conditions that affect speech use their cloned voice through AAC devices, screen readers, or TTS apps so they can continue communicating in a voice that sounds like them.

How many voice samples does Apple Personal Voice require?

Apple Personal Voice (iOS 17 and macOS Sonoma or later) requires you to read approximately 150 phrases aloud. The process takes roughly 20–30 minutes of active recording in total, and the model trains on-device, meaning your voice data does not need to leave your iPhone or Mac.

Can voice cloning work for someone who has already lost their voice?

Only if recordings of the person’s voice exist before the voice loss. This is why voice banking is strongly recommended as early as possible after a diagnosis of ALS, MND, or any progressive condition. VocaliD, Acapela My-own-voice, and similar services can build a model from 20 minutes to several hours of pre-recorded speech.

Is voice cloning for accessibility covered by insurance?

Some AAC devices and associated software qualify for funding through Medicare, Medicaid, or private insurance in the US, and through NHS assistive technology schemes in the UK. The cloning service itself is often a separate cost. Organizations like the ALS Association and MND Association sometimes provide grants. Always check with a speech-language pathologist who specializes in AAC.

What is the difference between voice banking and voice cloning?

Voice banking typically refers to recording a library of phrases that are spliced together phonetically to produce new sentences — a concatenative approach. Voice cloning builds a neural model from the recordings and can generate any text in a natural-sounding version of the original voice. Modern platforms blur this line, but cloning generally sounds more natural for novel sentences.

Can a cloned voice be used with screen readers and other apps?

Some platforms expose a cloned voice as a SAPI5 (Windows) or NVDA-compatible speech synthesizer, allowing it to work with any screen reader or TTS-enabled application. Compatibility varies by provider. VoxBooster can route a cloned voice to any app through a virtual microphone, which is a flexible workaround when direct SAPI5 integration is unavailable.

How long does it take to clone a voice for accessibility use?

With modern AI voice synthesis, a usable model can be ready in minutes to a few hours from as little as 20–30 minutes of clean source audio. Apple Personal Voice processes its model on-device overnight. Enterprise platforms for AAC often take 1–3 business days for quality review. The more clean audio provided, the more natural the result.

Conclusion

Accessibility voice cloning has become one of the clearest cases where AI technology delivers meaningful, human-centered value. Whether you are a person with ALS banking your voice before it changes, someone preparing for a laryngectomy, or a caregiver helping a family member set up AAC software — the tools are here, the process is documented, and the outcome is preserving a fundamental part of human identity.

The practical advice: start early, record clean audio, choose a platform matched to your device ecosystem, and work with a speech-language pathologist when possible. Personal Voice is the right answer for iPhone and Mac users who need a free starting point. Acapela and VocaliD are the professional choices for AAC hardware integration. ElevenLabs covers developer and app-builder use cases. VoxBooster fills the Windows routing gap when other tools do not connect directly to your applications.

If you want to explore what personal voice TTS looks like in a Windows environment — including how a cloned voice feeds into calls, streams, and accessibility software through a virtual microphone — VoxBooster offers a 3-day free trial with no credit card required. The voice model you create is yours, processing runs locally, and no kernel driver installation is required.

For the clinical side of voice preservation, read our detailed guide on voice banking for medical patients next.