Voice Cloning for ALS Patients: Preserve Your Voice Before It’s Gone
ALS voice clone technology has shifted from experimental research to a practical, accessible option for patients and families facing the progression of amyotrophic lateral sclerosis. The core idea is straightforward: record your natural voice while you still have it, use AI to build a synthetic model from those recordings, then deploy that model in augmentative and alternative communication (AAC) devices so you continue to sound like yourself — not a generic text-to-speech robot — as speech ability declines.
This guide covers who provides voice banking for ALS patients, what the process actually involves, how cloned voices integrate with AAC hardware, and what to do if progression has already advanced.
TL;DR
- Voice banking should start as soon as possible after ALS diagnosis — ideally before speech is noticeably affected.
- Major programs: ProjectRevoice (free, ALS-focused), Acapela MyOwnVoice, ModelTalker.
- Cloned voice profiles can be loaded into AAC devices including Tobii Dynavox and EyeGaze systems.
- Recording quality and timing matter more than quantity of hours — clear, early recordings outperform large volumes of impaired speech.
- Reconstruction from existing recordings (videos, voicemails) is possible but yields variable results.
- AI voice cloning also preserves family connection — a voice that carries 30 years of personality is irreplaceable.
Why ALS Voice Preservation Matters
ALS — amyotrophic lateral sclerosis, also called Lou Gehrig’s disease — is a progressive neurodegenerative condition that affects the motor neurons controlling voluntary muscle movement. For most patients, this includes the muscles of speech: the tongue, lips, jaw, soft palate, and larynx. Dysarthria (speech impairment due to muscle weakness) and eventually anarthria (complete loss of functional speech) are among the most emotionally difficult consequences of the disease.
The traditional alternative has been text-to-speech synthesis using generic synthesized voices. While functional, these voices carry none of the patient’s identity — the rhythm, warmth, regional accent, and timbre that family members and friends have known for decades. When a husband with ALS tells his wife he loves her using a generic computerized voice, something fundamental is lost. When he says those same words in his own voice, synthesized by AI from recordings made two years earlier, the connection is preserved.
This is the human case for ALS voice banking, and it goes beyond communication utility into something closer to dignity and identity preservation.
The technical case is equally compelling. Given training data of sufficient quality and quantity, modern AI voice synthesis can produce speech that listeners find hard to tell apart from natural speech in controlled listening tests. For ALS patients who begin banking early, the output is recognizably their own voice rather than a rough approximation of it.
Understanding Voice Banking: What It Is and How It Works
Voice banking is the structured process of recording a large corpus of your natural speech so that a text-to-speech engine or AI voice cloning system can learn your voice’s specific acoustic characteristics. The resulting model can then generate new speech — words and sentences you never actually recorded — in your voice.
The traditional approach (used by ModelTalker and similar tools) requires recording a prescribed set of sentences — often 1,600 or more — designed to cover phonetic diversity. The sentences include every consonant cluster, vowel combination, and prosodic pattern that the synthesis engine needs to generalize across new text. This approach is well-tested and produces reliable results, but it requires significant time commitment, often spread across many sessions over weeks or months.
The modern AI cloning approach uses deep learning models that can generalize from smaller datasets. Where traditional concatenative synthesis needed every phoneme explicitly recorded, neural voice synthesis learns abstract acoustic representations and can generate novel sounds from fewer examples. Some systems now produce acceptable output from 30–60 minutes of clean audio, though an hour of well-recorded speech almost always outperforms a day’s worth of impaired recordings.
The ALS-specific challenge is that the window for capturing high-quality speech narrows as the disease progresses. Recordings made when speech is already noticeably slurred, low in volume, or losing prosodic range produce a synthetic voice that inherits those impairments. The goal is always to record as early as possible, when the voice still sounds natural.
The Three Main Voice Banking Programs for ALS
ProjectRevoice
ProjectRevoice is a free program specifically created for people living with ALS. It was founded with backing from the ALS Association and has helped hundreds of patients preserve their voices. The program pairs patients with voice banking volunteers and speech-language pathologists who guide them through the recording process.
ProjectRevoice uses AI voice cloning technology — not concatenative synthesis — which means the recording requirement is more manageable than older methods. The resulting voice profiles integrate with common AAC platforms. The program also emphasizes ease of use for patients who may be dealing with the emotional and practical weight of a recent diagnosis.
For families in the United States, ProjectRevoice is typically the first recommendation from ALS clinics. The ALS Association’s chapter network can connect patients to the program and provide support through the process.
Acapela MyOwnVoice
Acapela Group is a commercial voice technology company with a strong assistive technology track record. Their MyOwnVoice program allows individuals to create a personal synthetic voice from recordings, with packages designed for people who need the voice for AAC use.
Acapela offers both a standard recording pathway (hundreds of sentences in their proprietary recording interface) and an abbreviated pathway for patients with limited recording capacity. The resulting voice is stored in their system and can be exported to compatible AAC software. Acapela voices integrate with Tobii Dynavox devices, among others.
Pricing and subsidized options vary by country and situation. For ALS patients in Europe and Australia, Acapela often has local partnerships that reduce or eliminate costs. Contact their assistive technology team directly for current options.
ModelTalker
ModelTalker, developed by researchers at the University of Delaware, is one of the longest-running voice banking systems. It is free to use and has an extensive track record with ALS and other motor neuron disease patients.
The system asks users to record a large set of carrier sentences — historically around 1,600, though the platform has options for shorter banking — through a web-based recording interface. Once complete, the system builds a personalized synthesis voice that can be used in their free SpeakIt app or exported for use in other AAC systems.
ModelTalker’s main advantage is its established research base and well-understood output quality. Its main limitation is the recording burden — 1,600 sentences is a significant commitment, particularly for patients experiencing fatigue or early speech impairment. The phased approach (banking in installments over weeks) is the recommended workaround.
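To make the recording burden concrete, here is a rough back-of-the-envelope sketch of how many sessions a 1,600-sentence list might take. The pace of 15 seconds per sentence (including retakes and pauses), the 25-minute session length, and the two-sessions-per-week schedule are all illustrative assumptions, not figures published by ModelTalker; adjust them to the patient's actual pace.

```python
import math


def estimate_banking_schedule(sentences, seconds_per_sentence,
                              session_minutes, sessions_per_week):
    """Return (sessions, weeks) needed to record a prescribed sentence list."""
    total_seconds = sentences * seconds_per_sentence
    sessions = math.ceil(total_seconds / (session_minutes * 60))
    weeks = math.ceil(sessions / sessions_per_week)
    return sessions, weeks


# Assumed values (hypothetical): 15 s per sentence, 25-minute sessions, 2 per week.
sessions, weeks = estimate_banking_schedule(1600, 15, 25, 2)
print(f"{sessions} sessions over about {weeks} weeks")  # 16 sessions over about 8 weeks
```

Under these assumptions the full list takes roughly two months, which is why starting early and banking in installments matters.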
Comparison of Voice Banking Programs
| Program | Cost | Recording Requirement | AAC Integration | AI Cloning | ALS-Specific |
|---|---|---|---|---|---|
| ProjectRevoice | Free | Moderate (AI-based) | Yes | Yes | Yes |
| Acapela MyOwnVoice | Subsidized/paid | Moderate to high | Yes (Tobii Dynavox, others) | Yes | No (general assistive) |
| ModelTalker | Free | High (1,600+ sentences) | SpeakIt app + export | No (concatenative) | No (general) |
| VoxBooster | Free trial | Short (30–60 min) | Via audio export | Yes | No (general) |
VoxBooster is primarily designed for real-time voice changing and creative voice cloning, but its AI engine can produce personal voice profiles from limited recordings. It is not a clinical AAC pipeline — it does not replace ProjectRevoice or Acapela for dedicated AAC integration — but for patients who want to create a personal voice for use in family communication, narration, or recording messages for loved ones, it offers an accessible entry point without a lengthy process. See also our guide on voice cloning for voiceover production for context on what AI voice synthesis can produce.
When to Start: The Critical Timing Window
The single most important advice from speech-language pathologists who specialize in ALS: start voice banking immediately after diagnosis.
This is not alarmist — it is logistical. Voice banking takes time, and disease progression can outpace a delayed banking schedule. Patients who begin when speech intelligibility is above 95% have ample time to produce excellent recordings across multiple sessions. Patients who delay until speech is already noticeably affected often wish they had started sooner.
Speech intelligibility benchmarks for voice banking:
| Intelligibility Level | Recommended Action |
|---|---|
| 95–100% | Start banking immediately. This is the optimal window. |
| 85–95% | Still good. Prioritize sessions and aim for 2–3 per week. |
| 70–85% | Possible, but recordings will show some impairment. Begin today. |
| Below 70% | Cloning from new recordings becomes difficult. Look at reconstruction from existing recordings (videos, voicemails). |
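Your speech-language pathologist can measure intelligibility formally. The Speech Intelligibility Test (SIT) and ASHA’s Functional Communication Measures are commonly used tools, and the speech item of the ALS Functional Rating Scale-Revised (ALSFRS-R) is often used to track change over time.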
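The benchmark table above can be expressed as a small lookup, which may be handy for a clinic checklist or intake form. This is only a sketch: the function name is hypothetical, the advice strings paraphrase the table, and because the table's bands share their edge values (95% appears in two rows), the code assumes a boundary score belongs to the higher band.

```python
def banking_recommendation(intelligibility_percent):
    """Map a measured intelligibility score (0-100) to the guide's advice."""
    if not 0 <= intelligibility_percent <= 100:
        raise ValueError("intelligibility must be between 0 and 100")
    if intelligibility_percent >= 95:
        return "Optimal window: start banking immediately."
    if intelligibility_percent >= 85:
        return "Still good: prioritize sessions, 2-3 per week."
    if intelligibility_percent >= 70:
        return "Possible, with some impairment in the recordings: begin today."
    return "New recordings are difficult: look at reconstruction from existing audio."


for score in (98, 90, 75, 60):
    print(score, "->", banking_recommendation(score))
```

A formal score from a speech-language pathologist should always take precedence over a self-estimate.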
Fatigue is the enemy of recording quality. Sessions should be 20–30 minutes maximum, scheduled when the patient’s energy and voice are at their daily peak — typically mid-morning for most people. Avoid recording after meals, during illness, or at end-of-day when vocal fatigue reduces quality.
Integrating a Cloned Voice with AAC Devices
A cloned voice profile is only useful if it can actually produce speech when the patient selects words or phrases on their AAC device. Integration varies by platform and voice banking program.
Tobii Dynavox
Tobii Dynavox is the market leader in eye-tracking AAC devices. Their Snap and Compass software supports custom voice profiles. Voices created through compatible banking programs — including Acapela-compatible exports — can be loaded as the device’s TTS voice, so eye-gaze communication outputs speech in the patient’s own voice.
The Tobii Dynavox integration requires that the voice profile be in a compatible format. Not all AI cloning outputs are compatible without conversion. Your speech-language pathologist or an assistive technology specialist can guide the technical setup.
EyeGaze Systems
EyeGaze (LC Technologies) devices also support custom TTS voice integration, though compatibility depends on the specific software version. The patient’s voice is selected in the AAC software settings, and new text input is synthesized using the custom voice profile.
Grid-Based AAC Apps (Snap Core First, TouchChat, Proloquo2Go)
These tablet-based AAC applications support custom TTS voices through SAPI-compatible or platform-specific voice engines. Some accept voice profiles from Acapela and similar vendors directly. Check the app’s documentation for supported voice import formats.
The Gap Between What Exists and What Patients Need
One honest observation: the technical pipeline from “AI voice clone” to “working AAC voice” is not always smooth. Clinical voice banking programs have invested specifically in this integration problem. General-purpose AI voice cloning tools — including many commercial services — may produce excellent audio but not export in formats that plug directly into AAC devices.
This is why clinical programs like ProjectRevoice exist. They solve not just the AI modeling problem but the integration problem. General-purpose voice cloning tools fill a different need: creating a voice for family messages, audio recordings, memorial content, or informal communication that doesn’t route through an AAC device.
Voice Cloning When Speech Has Already Declined
Not every ALS patient hears about voice banking in time. For patients who have already experienced significant speech loss, two options exist.
Reconstruction from Existing Recordings
Home videos, voicemails, phone recordings, birthday speeches, professional recordings, or any audio where the person speaks clearly can serve as source material. AI voice synthesis systems can train on this material, though quality varies dramatically based on:
- Audio quality (phone-recorded voicemails are often noisy)
- Recording length (more is better; a 20-second voicemail yields poor results)
- Speaking style consistency (narrated speech works better than casual conversation)
- Background noise levels
Some services specialize in voice reconstruction from limited materials. The output is rarely as natural as a purpose-recorded banking corpus, but even an imperfect reconstruction can carry emotional weight for family members — the rhythm, the accent, the characteristic phrasing is still there.
For ALS families thinking about voice preservation for memory and connection rather than active AAC use, our related guides on voice cloning for grief and memorial audio and voice cloning for dementia and familiarity audio explore this dimension in more detail.
Modified Banking with Impaired Speech
If some speech remains, banking is still worth attempting. Speech that scores 60–70% on intelligibility can still produce a usable synthetic voice, particularly for frequently used phrases and family communication — it just won’t generalize as cleanly to novel text. A pragmatic approach: bank a core set of frequently used phrases (expressions of love, daily need requests, emotional responses) rather than trying to build a fully generative voice model. Even a phrase-based system in the patient’s own voice is significantly more personal than a generic TTS voice.
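One way to organize a phrase-based bank is a simple recording checklist that pairs each phrase with a file name. The sketch below is illustrative only: the categories follow the ones named above, but the example phrases and the file-naming scheme are hypothetical and should be replaced with the patient's own words.

```python
import re

# Hypothetical example phrases; replace with the patient's own expressions.
PHRASE_BANK = {
    "love": ["I love you", "I'm proud of you"],
    "daily_needs": ["I'm thirsty", "Please adjust my pillow"],
    "responses": ["Yes", "No", "Thank you"],
}


def recording_checklist(phrase_bank):
    """Return (filename, phrase) pairs, one per phrase to record."""
    checklist = []
    for category, phrases in phrase_bank.items():
        for number, phrase in enumerate(phrases, start=1):
            # Turn the phrase into a safe file-name fragment.
            slug = re.sub(r"[^a-z0-9]+", "_", phrase.lower()).strip("_")
            checklist.append((f"{category}_{number:02d}_{slug}.wav", phrase))
    return checklist


for filename, phrase in recording_checklist(PHRASE_BANK):
    print(f"{filename}: {phrase}")
```

Keeping one phrase per file makes it easy to re-record a single item on a better day without redoing the whole set.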
The Emotional Dimension: Voice as Identity
This is not a clinical topic, but it belongs in any honest discussion of ALS voice banking.
A person’s voice is one of the most identity-linked aspects of their existence. It carries accent, personality, emotional range, humor, and history. Spouses who have heard the same voice for 30 or 40 years recognize it the way they recognize a face. Children of ALS patients — particularly young children — may grow up with few natural recordings of their parent’s voice.
Voice banking, when successful, preserves that identity. It allows an ALS patient to:
- Continue speaking in family conversations with a voice that sounds like them
- Record messages for children and grandchildren to open years later
- Maintain a sense of self during a period when the body is changing rapidly
- Communicate emotion through a voice with their characteristic warmth and cadence, not a generic robotic voice
The practical value of AAC communication is obvious. The emotional value of sounding like yourself while doing it is harder to quantify but arguably more important.
For families creating audio messages or recordings for the future — not necessarily for active AAC use — tools like VoxBooster can generate voice content in the preserved voice from written text. The output can become narration for family videos, personal audio diaries, or messages to be delivered at future milestones. Our guide on personalized sleep stories created with voice cloning shows one creative application of this capability.
Recording Best Practices for ALS Voice Banking
Regardless of which program you use, recording quality matters enormously. These practices apply universally.
Equipment:
- Use a USB condenser microphone rather than a built-in laptop mic. A dedicated mic placed 6–8 inches from the mouth reduces room noise and captures fuller frequency response.
- Record in a quiet room. Avoid kitchen appliances, HVAC noise, or traffic-heavy windows.
- Record WAV files at 44.1 kHz or 48 kHz, 16-bit minimum. Do not record in MP3 — lossy compression at source reduces voice model quality.
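The file-format guidance above can be checked automatically before a recording is uploaded. This is a minimal sketch using Python's standard `wave` module; the function name and the returned dictionary are hypothetical, and the module only reads uncompressed PCM WAV files, so other formats will raise an error rather than produce a report.

```python
import wave


def check_wav_format(path, min_rate=44100, min_bits=16):
    """Report sample rate, bit depth, and length of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8  # sample width is reported in bytes
        channels = wav.getnchannels()
        seconds = wav.getnframes() / rate
    problems = []
    if rate < min_rate:
        problems.append(f"sample rate {rate} Hz is below {min_rate} Hz")
    if bits < min_bits:
        problems.append(f"bit depth {bits} is below {min_bits}-bit")
    return {
        "sample_rate": rate,
        "bit_depth": bits,
        "channels": channels,
        "seconds": seconds,
        "problems": problems,
    }
```

Calling `check_wav_format("session_01.wav")` (a placeholder file name) returns an empty `problems` list when the file meets the 44.1 kHz, 16-bit minimum.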
Recording sessions:
- 20–30 minutes per session maximum. Vocal fatigue degrades recording quality and the model learns from fatigued speech.
- Schedule sessions when energy is highest — typically mid-morning.
- Speak at natural conversational volume and pace. Do not “perform” or exaggerate clarity — the AI trains on how you actually talk.
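- Record on multiple days. Natural day-to-day variation in your voice gives the model a more representative sample, but keep the microphone, distance, and room the same so the background sound stays consistent.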
What to record:
- All required sentences from the banking program’s prescribed list
- Additional personal phrases: names of family members, frequently used expressions, terms of endearment
- A short free-speech segment (read a passage or speak naturally for 5 minutes) to capture natural prosodic variation
Technical setup:
- Disable automatic gain control (AGC) in your recording software — it compresses dynamics in ways that confuse voice models
- Aim for peaks around -12 to -6 dBFS on your meter
- Listen back to the first 60 seconds before committing to a full session — better to catch a buzzing air conditioner before recording 300 sentences than after
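If your recording software does not show a clear meter, the peak level of a test recording can be measured after the fact. The sketch below assumes a 16-bit PCM WAV file and uses only the Python standard library; the function names are hypothetical, and the -12 to -6 dBFS window is the target range given above.

```python
import array
import math
import sys
import wave


def peak_dbfs(path):
    """Return the peak level of a 16-bit PCM WAV file in dBFS (0 = full scale)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)  # signed 16-bit integers
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # pure digital silence
    return 20 * math.log10(peak / 32768)


def level_advice(db, low=-12.0, high=-6.0):
    """Compare a peak level against the target window."""
    if db > high:
        return "too hot: lower the input gain"
    if db < low:
        return "too quiet: raise the gain or move closer to the microphone"
    return "ok"
```

Run it on a short test clip first, for example `level_advice(peak_dbfs("test_clip.wav"))` with a placeholder file name, and adjust the gain before recording a full session.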
How General-Purpose AI Voice Cloning Compares
Beyond the specialized ALS banking programs, general-purpose AI voice cloning tools — including VoxBooster, ElevenLabs, Resemble AI, and others — have matured significantly. Some ALS patients and families use these tools alongside or instead of dedicated banking programs, particularly for use cases outside of AAC device integration.
The key differences:
| Factor | Specialized ALS Programs | General-Purpose AI Cloning |
|---|---|---|
| AAC device integration | Native, tested | Manual/variable |
| Clinical speech-language support | Yes | No |
| Recording guidance | Structured, prescribed | Self-directed |
| Cost | Free/subsidized | Varies; often free tier available |
| Output naturalness | High (purpose-built) | High (improving rapidly) |
| Use case | AAC communication | Creative, family, memorial |
| Insurance/funding eligibility | Sometimes covered | Rarely covered |
For patients who want a voice for family messages, recorded narration, or creative purposes — but not necessarily for AAC device integration — general-purpose tools offer a faster, more flexible path. The AI voice cloning technology in these tools has now reached quality levels that make the output genuinely personal and emotionally resonant, not just technically functional.
If you are exploring this for a family member who is interested in voice cloning for broader creative or therapeutic purposes — for example, the way voice cloning is used to support people with communication challenges from other causes — our article on voice cloning for stuttering therapy contexts provides a useful adjacent perspective.
Practical Steps: Getting Started This Week
If you or someone you know has an ALS diagnosis, here is the practical starting sequence:
- Contact ProjectRevoice (projectrevoice.org) and request enrollment. They will guide you through their process at no cost and connect you with a speech-language pathologist if you do not already have one.
- Ask your neurologist for a referral to an ALS clinic with an SLP who specializes in AAC. This is a clinical need, not a luxury: SLPs who work with ALS patients know the banking programs, the AAC devices, and the integration steps.
- Set up a basic recording environment this week. You do not need to wait for the formal banking process to start capturing your voice. Record 30 minutes of casual conversation, read a few passages, narrate a family story. These recordings have value regardless of which formal banking program you later use.
- Inventory existing recordings. Go through phone videos, voicemails, old videos, any recordings where your voice is prominent and clear. Back these up in multiple places. If formal banking is not possible later, these become your reconstruction source material.
- Talk to your ALS Association chapter. They often have funding to cover equipment costs (USB microphone, recording software) and can connect you to volunteers who help with the recording process.
- Don’t delay waiting for the “right time.” There is no right time. There is only now, and later. For voice banking, now is always better.
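The inventory step above can be partly automated. The following is a minimal sketch, assuming the recordings sit in ordinary folders on a computer: it walks a folder and lists audio and video files, largest first. The extension list is an assumption and is not exhaustive, and file size is only a rough stand-in for recording length, so each file still needs a listen to confirm the voice is clear.

```python
from pathlib import Path

# Assumed set of common audio/video extensions; extend as needed.
MEDIA_EXTENSIONS = {
    ".wav", ".mp3", ".m4a", ".aac", ".flac", ".ogg", ".amr",
    ".mp4", ".mov", ".avi", ".mkv", ".3gp",
}


def inventory_recordings(root):
    """Return (path, size_in_bytes) for media files under root, largest first."""
    found = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in MEDIA_EXTENSIONS:
            found.append((path, path.stat().st_size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

For example, `inventory_recordings("/path/to/family_videos")` (a placeholder path) returns a list you can review and then copy to a second drive and a cloud backup.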
Conclusion
ALS voice preservation is one of the most meaningful applications of modern AI voice technology. The ability to bank a voice before speech declines — and then deploy it through AAC devices so that a person continues to sound like themselves through years of communication — represents a genuine improvement in quality of life and dignity.
The key practical points: start as early as possible, use established programs like ProjectRevoice and Acapela MyOwnVoice for AAC-integrated voice banking, record in quality conditions with proper equipment, and add general-purpose AI voice cloning tools for family and memorial use cases that fall outside clinical pipelines.
Tools like VoxBooster can complement this process — generating speech in a preserved voice for recorded messages, family narrations, or personal projects — without replacing the clinical pathway for AAC device integration. A 3-day free trial is available with no credit card required, if you want to explore what the technology can produce from a recording sample.
The voice that matters is the one that belongs to the person. Every week earlier that voice banking begins means a better voice model, one that serves the patient and family for the rest of their lives together.
Frequently Asked Questions
What is ALS voice banking and why does it matter?
ALS voice banking is the process of recording your natural voice before disease progression causes significant speech impairment. Those recordings are then used by AI systems to generate a synthetic clone of your voice, which powers AAC (augmentative and alternative communication) devices. Starting early — while your voice is still strong — produces dramatically better results.
When should an ALS patient start voice banking?
As early as possible after diagnosis, ideally before speech becomes noticeably affected. The best results come from banking while intelligibility is still above 95%, although recordings made at 85–95% are still good. Voice quality degrades over months, and AI cloning models trained on clear speech produce far more natural output than those trained on already-impaired recordings.
Is voice banking free for ALS patients?
Several programs offer free or subsidized voice banking for people with ALS and other motor neuron diseases. ProjectRevoice provides free voice banking with a focus on ALS, and ModelTalker is also free to use. Acapela MyOwnVoice is a commercial service, but subsidized or no-cost options exist depending on country and situation. Check with your local ALS Association chapter for additional funding resources.
Can a cloned ALS voice work with Tobii Dynavox and other AAC devices?
Yes. Most professional voice banking programs export voice profiles in formats compatible with major AAC platforms including Tobii Dynavox, EyeGaze systems, and grid-based communication apps. Confirm export format compatibility with your speech-language pathologist before choosing a banking program, as integration steps vary by device.
How many hours of recording does voice banking require?
Requirements vary by program. ModelTalker traditionally asks for about 1,600 sentences. Acapela MyOwnVoice asks for fewer: hundreds of sentences on its standard pathway, with an abbreviated pathway for patients with limited recording capacity. Newer AI cloning approaches can work with as little as 30–60 minutes of clear speech. More clean audio generally improves naturalness, but clarity matters more than volume. Spread sessions across multiple days to avoid vocal fatigue.
What if an ALS patient has already lost their natural voice?
If recordings of the person’s natural voice exist — home videos, voicemails, interviews, audio messages — these can sometimes be used as source material for reconstruction, though quality varies. Some services specialize in voice reconstruction from limited samples. Family-recorded AI memorial voices serve a different but related purpose for families who want to preserve connection.
Can ALS patients use voice cloning for real-time communication?
Yes, with modern AAC integration. A synthesized voice profile can be loaded into AAC software so that when the patient selects words or phrases — using eye tracking, switch access, or other input methods — the output uses their cloned voice rather than a generic synthesizer. This preserves voice identity in everyday conversation.