Voice Banking for Medical Patients: Preserve Your Voice Before Surgery

Voice banking lets ALS, MND, and laryngectomy patients preserve their natural voice before losing it. Learn timing, tools, and how modern AI cloning cuts recording time.

Voice banking for ALS patients, and for anyone facing surgery or disease that may permanently alter or eliminate their natural speech, is one of the most time-sensitive decisions a person can make. This guide covers what voice banking is, who should consider it, when to start, how much audio you need, which services to use, what insurance covers, and how modern AI voice cloning has cut the required recording time from hours to minutes.

If you or someone you care for has been diagnosed with ALS, MND, laryngeal cancer, or another condition affecting the voice, the most important thing to know is this: start recording as soon as possible, before any change in speech quality is noticeable.


Key Takeaways

  • Voice banking preserves your unique vocal identity for use in text-to-speech communication devices after you can no longer speak naturally.
  • The ideal time to start is immediately after diagnosis, before any dysarthria (slurred or weakened speech) develops.
  • Traditional services need 1-3 hours of prompted recordings; modern AI cloning can work with as little as 5-15 minutes.
  • Free programs exist through the ALS Association, ModelTalker, and hospital AAC clinics.
  • Medicare Part B covers speech-generating devices; the voice banking process itself is often free.
  • Message banking — recording personal phrases and emotional expressions — complements voice banking for legacy purposes.

What Voice Banking Actually Is

Voice banking is the process of recording a significant sample of your natural speech so that a computer system can learn to replicate your voice’s unique characteristics — pitch, rhythm, timbre, accent, and personality. The resulting model powers a text-to-speech (TTS) system: you type what you want to say, and the device speaks in your voice.

This matters deeply because communication is identity. The flat, robotic default voices of early AAC (augmentative and alternative communication) devices were functional but felt impersonal to many users and to their families. A banked voice says “this is still me speaking,” and for people who lose speech gradually, as happens with ALS, that continuity has real psychological and social value.

Voice banking is distinct from but closely related to message banking, where you record specific phrases you actually use (“I love you,” “I need more pain medication,” “that was a good one”) in your own voice, without any synthesis involved. The two approaches complement each other and are not mutually exclusive.

Who Should Consider Voice Banking

The primary candidates are people with conditions where speech loss is a known or likely outcome:

  • ALS (Amyotrophic Lateral Sclerosis) / MND (Motor Neurone Disease) — the most common indication; approximately 25% of ALS patients present with bulbar-onset ALS, meaning speech and swallowing are affected first.
  • Laryngectomy patients — people undergoing surgical removal of the larynx due to laryngeal cancer or severe trauma. The surgery is often planned weeks in advance, which is a meaningful window for recording.
  • Progressive bulbar palsy — a variant of MND that affects the brainstem directly, accelerating speech deterioration.
  • Multiple sclerosis (MS) — in some cases where speech is expected to deteriorate.
  • Parkinson’s disease — for patients with significant speech effects, though the progression is slower and the window longer.
  • Pre-surgical patients — anyone scheduled for throat, tongue, or jaw surgery who faces a significant chance of changed or lost voice as an outcome.

The common thread: the person still has a clear, strong voice now, but has reason to believe that will change. If you are in this group, the time to act is not “eventually” — it is this week.

The Right Time to Start: Earlier Than You Think

The most consistent advice from speech-language pathologists (SLPs) who specialize in AAC is: start banking within the first few weeks of an ALS diagnosis, not when you notice your voice changing.

By the time you notice a difference — speech that feels more effortful, slightly slurred consonants, reduced volume — the recordings will already show those characteristics. Synthesis models trained on dysarthric speech produce dysarthric synthetic voices. That output may still be useful and deeply personal, but it will not sound like the voice the person had before illness.

A Rough Timeline for ALS Voice Banking

| ALS Stage | Speech Status | Voice Banking Action |
|---|---|---|
| Diagnosis (no speech symptoms) | Normal, clear speech | Start banking immediately (ideal window) |
| Early bulbar symptoms | Slightly reduced volume or rate | Begin urgent banking; accept some limitation |
| Moderate dysarthria | Noticeable slurring, effort | Banking still possible with accommodations; add message banking |
| Severe dysarthria | Speech intelligibility significantly reduced | Focus on message banking; AAC device fitting |
| Anarthria | Unable to produce intelligible speech | Use existing banked voice or default AAC voice |

For laryngectomy patients the math is simpler: surgery is scheduled, you have a defined window of days or weeks, and every day of clear recording before surgery is a gift to your future self.

How Much Audio You Actually Need

This varies significantly by the platform and technology being used.

Traditional Voice Banking (HMM/statistical synthesis)

Services like ModelTalker and VocaliD use older statistical speech synthesis methods that require large amounts of training data to produce recognizable, natural-sounding output:

  • ModelTalker: 1,600 prompted phrases, typically 2-4 hours of actual recording spread over multiple sessions. Free for ALS and related conditions.
  • VocaliD: Variable, but similar scale. They blend your voice with a surrogate speaker’s voice bank, which means even a smaller set of recordings contributes to the final model.

These platforms provide scripted prompts — sentences chosen to cover all the phonemes and phoneme combinations in English. You read the prompts aloud into a microphone, the platform records them, and over weeks or months the model is built. The process is designed to be done in sessions of 15-20 minutes to avoid vocal fatigue.
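As a rough planning aid, the session arithmetic implied above can be sketched in Python. The 2-4 hour total and the 15-20 minute session length come from this guide; the function name `sessions_needed` is illustrative, and the calculation ignores retakes, rest days, and platform-specific pacing.

```python
import math


def sessions_needed(total_minutes: int, session_minutes: int) -> int:
    """Number of sessions needed to cover total_minutes of recording."""
    if total_minutes < 0 or session_minutes <= 0:
        raise ValueError("total must be non-negative and session length positive")
    return math.ceil(total_minutes / session_minutes)


# Shortest plan: 2 hours of recording in 20-minute sessions.
fewest = sessions_needed(2 * 60, 20)
# Longest plan: 4 hours of recording in 15-minute sessions.
most = sessions_needed(4 * 60, 15)

print(f"Plan for roughly {fewest} to {most} recording sessions.")
```

At one session per day, that works out to somewhere between about one and a little over two weeks of recording, before any time the service needs to build the voice.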

Modern AI Voice Cloning

Neural voice synthesis has changed the required audio volume dramatically. Platforms using modern transformer-based or diffusion-based voice models can produce a usable personal voice from:

  • 5-15 minutes of clean, diverse audio: a serviceable voice for basic TTS use
  • 30-60 minutes: noticeably more natural, better at capturing your specific accent and speech patterns
  • 2+ hours: the best results, closest to your natural voice across a wide range of phonetic contexts
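To keep track of progress against the tiers listed above, a small script can total the length of your recordings. This is a minimal sketch that assumes your takes are saved as WAV files in a single folder; the tier lower bounds come from the list above, while the handling of the gaps between tiers (for example 15-30 minutes) and all function names are assumptions made for illustration.

```python
import wave
from pathlib import Path

# Lower bounds in minutes, taken from the tiers above. Totals that fall in
# a gap between tiers are assigned to the tier below (an assumption).
TIERS = [
    (120, "best results"),
    (30, "noticeably more natural"),
    (5, "serviceable for basic TTS"),
]


def wav_minutes(path: Path) -> float:
    """Duration of one WAV file in minutes."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate() / 60


def total_minutes(folder: Path) -> float:
    """Total duration of all .wav files directly inside folder."""
    return sum(wav_minutes(p) for p in sorted(folder.glob("*.wav")))


def tier_for(minutes: float) -> str:
    """Map a running total to the highest tier whose lower bound it meets."""
    for lower_bound, label in TIERS:
        if minutes >= lower_bound:
            return label
    return "keep recording (under 5 minutes)"
```

For example, `tier_for(total_minutes(Path("recordings")))` would report the tier for a hypothetical `recordings` folder. Duration is only a proxy: noisy or repetitive audio counts toward the total here but may not help the resulting voice.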

The trade-off is that these platforms are often commercial products rather than free medical programs, though several accessibility-focused options have emerged.

Practical Recording Guidance

Regardless of the platform, good source audio matters more than quantity:

  • Record in a quiet room with minimal echo (a bedroom with soft furnishings works well)
  • Use a USB condenser microphone if possible; a laptop’s built-in mic is acceptable in a pinch but will capture more room noise
  • Keep the microphone 6-8 inches from your mouth, off-axis slightly to reduce plosives
  • Record in short sessions (15-20 minutes) to avoid vocal fatigue that changes your voice quality
  • Speak at your natural pace and volume — do not try to speak more clearly than usual; you want the model to learn your actual voice
  • Stay consistent across sessions: same microphone, same room, similar time of day
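Some of the guidance above can be spot-checked automatically after each take. The sketch below flags two common problems in a 16-bit PCM WAV file: clipping (too close or too loud) and a very low level (too far or too quiet). Both thresholds are illustrative assumptions rather than clinical or platform standards, and `check_take` is a hypothetical helper name.

```python
import array
import sys
import wave


def check_take(path: str, clip_fraction_limit: float = 0.001,
               min_peak_fraction: float = 0.1) -> list[str]:
    """Return a list of warnings for a 16-bit PCM WAV recording.

    Both thresholds are illustrative assumptions, not clinical standards.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            return ["not 16-bit PCM; this sketch cannot analyse it"]
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return ["file contains no audio"]

    full_scale = 32767
    peak = max(abs(s) for s in samples)
    clipped = sum(1 for s in samples if abs(s) >= full_scale)

    warnings = []
    if clipped / len(samples) > clip_fraction_limit:
        warnings.append("clipping detected: move back from the mic or lower the gain")
    if peak < min_peak_fraction * full_scale:
        warnings.append("very quiet take: move closer or raise the gain")
    return warnings
```

An empty list means neither check fired. It does not mean the take is good: room echo, background noise, and vocal fatigue are not measured here and still need a listen.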

Voice Banking Services: A Practical Comparison

Free and Subsidized Options

ModelTalker: Developed by the Nemours Speech Research Laboratory (now part of Nemours Children’s Health), ModelTalker is free for patients with ALS and related neurodegenerative conditions. It provides 1,600 scripted prompts through a dedicated recording app (Windows). The resulting voice works within AAC devices compatible with the ModelTalker format. The process requires patience (2-4 hours of recording across many sessions), but the price and the medical focus make it the default recommendation for ALS patients without technical barriers. Website: modeltalker.org

VocaliD: This service combines your voice recordings with recordings from a voice donor who shares your basic vocal characteristics (same sex, similar age, similar pitch). Even a small amount of your recordings is blended into the final model, giving it your vocal identity even if you could not complete a full recording set. VocaliD’s Human Voicebank project accepts donations from healthy speakers. The service has partnerships with several AAC device manufacturers. Website: vocalid.ai

ALS Association Voice Banking Program: The ALS Association has partnered with VocaliD to offer voice banking at no cost to people living with ALS. Contact your local ALS Association chapter or the national organization for current availability. Hospital-based AAC clinics affiliated with ALS care centers often provide facilitated voice banking sessions as part of the care team’s services.

Consumer and Semi-Clinical Options

Apple Personal Voice (iOS/macOS): Introduced in iOS 17, Apple Personal Voice allows any user with a compatible iPhone, iPad, or Mac to create a synthetic version of their own voice by recording approximately 150 phrases (about 15-20 minutes). The model runs entirely on-device, requires no internet for synthesis, and integrates with the system-level Live Speech feature (type to speak). It is free, private, and designed explicitly with ALS in mind: Apple has stated publicly that accessibility for people who may lose their voice was the primary motivation for the feature. The limitation is that it works only within the Apple ecosystem; the voice does not transfer to Android or Windows AAC devices.

Acapela My-own-voice: Acapela Group, a longtime AAC voice provider, offers a service where you record approximately 50 sentences (around 15 minutes) and receive a professional-quality TTS voice compatible with most major AAC platforms (Tobii Dynavox, Prentke Romich, and others). It is a paid service but is often covered by AAC device funding. The resulting voice can be licensed for use across multiple devices. Website: acapela-group.com

Comparison Table

| Service | Cost | Recording Time | Platform Compatibility | Medical Focus |
|---|---|---|---|---|
| ModelTalker | Free | 2-4 hours | ModelTalker-compatible AAC | ALS/MND specific |
| VocaliD / ALS Assoc. | Free (ALS) | Variable | Major AAC platforms | ALS focused |
| Apple Personal Voice | Free | ~15-20 min | Apple devices only | General (AAC-motivated) |
| Acapela My-own-voice | Paid (insurance) | ~15 min | Most major AAC platforms | Clinical AAC |
| AI cloning platforms | Varies | 5-60 min | Varies | General |

Medicare, Insurance, and Funding

Speech-generating devices (SGDs) — the devices that use banked voices to produce speech — are covered under Medicare Part B as durable medical equipment (DME) when the patient has a documented medical condition requiring AAC and meets functional criteria assessed by a licensed SLP. The SGD itself (often a dedicated tablet device from Tobii Dynavox, Prentke Romich, or similar) typically costs $3,000-$10,000+; Medicare covers 80% after the deductible.
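To make the "80% after the deductible" rule concrete, here is a worked example in Python. It is a simplified sketch: the $7,000 device price and the $250 of unmet deductible are made-up illustrative numbers, the approved amount may differ from a device's list price, and supplemental coverage is ignored. Check current figures with your insurance coordinator.

```python
from decimal import Decimal


def split_cost(approved_amount: Decimal, deductible_remaining: Decimal,
               coverage_rate: Decimal = Decimal("0.80")) -> tuple[Decimal, Decimal]:
    """Return (insurer_pays, patient_pays) for one device purchase."""
    # The patient first pays whatever part of the deductible is still unmet.
    deductible_part = min(approved_amount, deductible_remaining)
    after_deductible = approved_amount - deductible_part
    # The insurer then covers its share of the remainder.
    insurer_pays = after_deductible * coverage_rate
    patient_pays = approved_amount - insurer_pays
    return insurer_pays, patient_pays


# Hypothetical numbers: a $7,000 device and $250 of deductible still unmet.
insurer, patient = split_cost(Decimal("7000"), Decimal("250"))
print(f"Insurer pays ${insurer:,.2f}; patient pays ${patient:,.2f}")
```

With these illustrative inputs the insurer pays $5,400 and the patient pays $1,600: the $250 deductible plus 20% of the remaining $6,750.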

The voice banking process — the recording and model creation — is a separate matter:

  • ModelTalker and the ALS Association’s VocaliD program are free; no insurance question arises.
  • Apple Personal Voice is free as a software feature on Apple hardware the patient may already own.
  • Acapela My-own-voice and similar clinical services are often bundled into the AAC device funding; your AAC specialist should include it in the prior authorization documentation.
  • AI cloning platforms not affiliated with AAC device manufacturers are generally not covered by insurance; costs vary from $0 (some open-source options) to $50-200+ for commercial services.

Medicaid coverage varies by state but generally follows Medicare precedent for SGDs. Many states have additional AT (assistive technology) funding programs.

Private insurance: Coverage varies widely. Work with your neurologist, SLP, and insurance coordinator to document the medical necessity. The ALS Association and its local chapters have care service coordinators who have navigated this process many times and can advise.

Message Banking: The Human Layer

Voice banking creates a synthetic voice for ongoing communication. Message banking preserves actual recordings of you saying specific things — your real voice, your real laugh, your specific phrases — for playback as audio clips rather than synthesis.

The two serve different purposes:

  • A banked voice lets you type anything and have it spoken in your voice — open-ended communication.
  • Banked messages let you play back specific recordings with perfect fidelity to the original — intimate, personal, irreplaceable.

Message banking is lower technology (it’s essentially an organized audio library) and can be done very informally:

  • Record yourself reading to your children or grandchildren
  • Record stories from your life
  • Record phrases of endearment you use with specific people
  • Record yourself laughing, saying their names, expressing emotions
  • Record holiday or birthday messages for future years

Apps like PhraseIt and AbleNet simplify message organization for AAC use. Even a folder of smartphone voice memos, carefully labeled, is a valid starting point.
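If you start with a plain folder of voice memos, a short script can turn careful file naming into a searchable index. The sketch below assumes a hypothetical naming convention of `category__short-description.ext` (for example `family__goodnight-to-sam.m4a`); the convention, the function name, and the CSV layout are all illustrative choices, not a format required by any AAC app.

```python
import csv
from pathlib import Path

AUDIO_SUFFIXES = {".wav", ".m4a", ".mp3"}


def build_manifest(folder: Path, manifest_name: str = "manifest.csv") -> Path:
    """Write a CSV listing every audio clip in folder with a readable label.

    Assumes the hypothetical naming convention "category__short-description.ext".
    """
    rows = []
    for clip in sorted(folder.iterdir()):
        if clip.suffix.lower() not in AUDIO_SUFFIXES:
            continue  # skip notes, the manifest itself, and other non-audio files
        category, _, description = clip.stem.partition("__")
        if not description:  # file name does not follow the convention
            category, description = "unsorted", clip.stem
        rows.append([clip.name, category, description.replace("-", " ")])

    manifest = folder / manifest_name
    with manifest.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "category", "label"])
        writer.writerows(rows)
    return manifest
```

The resulting CSV opens in any spreadsheet program, which makes it easy for a family member or SLP to review what has been recorded and spot gaps.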

For patients with limited recording time or energy — or for whom the voice banking window has already partially closed — message banking often becomes the primary focus and can be deeply meaningful to families.

Family Planning and the Emotional Dimension

Voice banking is a practical medical task with significant emotional weight. The act of recording your voice while knowing it will be lost requires coming to terms with the prognosis at the same moment you are also organizing clinical care, legal affairs, and family communication. This is hard.

A few things that help:

Involve the care team early. An SLP experienced with AAC and ALS will have guided many families through this process. They know the questions, the emotional patterns, and the practical shortcuts. Ask for an AAC evaluation at diagnosis, not six months later.

Set small, achievable recording goals. Fifteen minutes per day is sustainable; trying to record everything in a weekend is not. Consistent short sessions produce better audio and cause less emotional exhaustion.

Include family members as recording partners. Having a spouse, child, or close friend run the recording sessions — giving prompts, noting when a take was clear — shares the emotional load and turns a clinical task into shared time together.

Be honest about the purpose, but also about the gift. Many patients find that what they are doing for themselves also becomes something they are doing for the people who love them. A grandchild who grows up hearing a grandparent’s banked voice reading them a story has something real and irreplaceable.

The Role of AI Voice Cloning in Medical Voice Preservation

Modern AI voice cloning has made voice preservation significantly more accessible for medical patients in two ways: less recording time required, and more natural-sounding output.

Where traditional synthesis needed 1-3 hours of prompted phrases to produce a recognizable voice, current neural voice models can learn your vocal characteristics from 5-15 minutes of diverse, natural-sounding speech. This is meaningful for ALS patients whose energy and voice quality may already be limited, and for laryngectomy patients working against a surgical deadline.

The naturalness improvement is also real. Statistical synthesis voices often sound slightly robotic or flat; neural voices trained on even a modest amount of audio can capture individual qualities — accent coloring, characteristic vowel sounds, speech rhythm — in a way earlier technology could not.

Tools like VoxBooster that offer AI voice cloning are primarily designed for real-time creative use — streaming, gaming, content creation — but the underlying technology is the same. For patients who want a voice preservation option outside the traditional AAC ecosystem (for example, to use on a Windows PC with a standard TTS reader), AI cloning tools represent a meaningful option.

See also our overview of how voice cloning intersects with accessibility for a broader look at AI voice tools in the context of disability and communication.

Practical Steps to Get Started This Week

If you are reading this following a recent diagnosis, here is a concrete action list:

  1. Contact your neurologist or care team and ask for a referral to an AAC-specialized SLP. Many major ALS care centers have one on staff.
  2. Register with ModelTalker (modeltalker.org) — free, and you can begin reading prompts within hours of signing up.
  3. Set up Apple Personal Voice if you are in the Apple ecosystem — the 150-phrase recording session takes about 20 minutes and can be done today.
  4. Contact your local ALS Association chapter and ask specifically about their voice banking resources and the VocaliD partnership.
  5. Start informal message banking now — record voice memos on your phone of stories, expressions of love, names, laughter. Label them clearly.
  6. Evaluate AI cloning options if you want to create a voice model usable on non-AAC platforms (Windows PC TTS, smart speakers, etc.).

Do not wait until the process feels urgent. The goal is to capture your voice while it is at its best.

Consent and Identity

Voice preservation touches on questions of consent and identity that are worth acknowledging briefly.

A person’s voice model — like their genetic data — is intensely personal. Consider:

  • Who controls access to the model after your death? Some services transfer ownership to your estate; others retain the model. Read the terms carefully.
  • What uses are you consenting to? Specifying in writing that the voice model is for personal AAC use, and not for any commercial, entertainment, or research purpose, is reasonable and many services support this.
  • Family decisions about a deceased person’s banked voice can be emotionally complex. Having an explicit written statement about your wishes — for use during life and after death — removes ambiguity.

Our post on voice cloning ethics in 2026 covers the broader landscape of consent and personal identity in voice AI, and our overview of voice cloning for memorials and legacy preservation addresses the specific question of how families approach post-mortem use of a loved one’s voice model.

Frequently Asked Questions

What is voice banking for ALS patients?

Voice banking is the process of recording a sufficient amount of your natural speech before disease progression affects your voice, so that text-to-speech systems can later reproduce your unique vocal identity. For ALS patients this typically means recording 1-3 hours of prompted phrases while the voice is still strong and clear.

How much audio do you need to record for voice banking?

Traditional services like ModelTalker and VocaliD require 1,600 to 3,200 prompted phrases (roughly 1-3 hours of clean recordings spread over several sessions). Modern AI voice cloning platforms can generate a usable personal voice from as little as 5-15 minutes of high-quality audio, though more always improves naturalness.

When should ALS patients start voice banking?

As early as possible after diagnosis — ideally before any noticeable change in speech clarity, volume, or speed. Most speech-language pathologists recommend starting within the first few weeks of an ALS diagnosis. Once dysarthria (slurred or weakened speech) develops, recorded audio quality drops significantly and the resulting synthetic voice will reflect those changes.

Is voice banking covered by Medicare or insurance?

Speech-generating devices (SGDs) that use a banked voice are generally covered under Medicare Part B as durable medical equipment once a patient meets functional criteria for AAC (augmentative and alternative communication). The voice banking process itself — recording and model creation — is often offered free through services like ModelTalker and the ALS Association’s program with VocaliD. Private insurance coverage varies; check with your neurologist or AAC specialist.

Can voice banking be done at home?

Yes. Services like ModelTalker, Apple Personal Voice, and Acapela My-own-voice are designed for home recording with a standard USB microphone or even a laptop’s built-in mic. A quiet room, consistent microphone placement, and short daily sessions (15-20 minutes) across several weeks produce better results than marathon recording sessions.

What is the difference between voice banking and voice cloning?

Traditional voice banking uses statistical methods (HMM-based speech synthesis) trained on hundreds of scripted phrases to build a custom TTS voice. Modern AI voice cloning uses neural networks that can model your voice from much shorter samples and produce more natural-sounding output. Both serve the same purpose — preserving your voice identity — but AI cloning is faster and often sounds more realistic.

What happens if I wait too long to start voice banking?

If dysarthria is already present, recordings will capture slurred or weakened speech, and the synthetic voice will reflect those characteristics. It may still be usable, but naturalness suffers. Some services offer “loud speech” protocols where patients with mild dysarthria record in a louder-than-normal voice to capture clearer phonemes. If speech is already severely affected, message banking — recording personal phrases, stories, and expressions of emotion — becomes the primary focus.

Conclusion

Voice banking is one of the most meaningful things a person facing voice loss can do — for their own communication needs, and for the people who love them. The technology exists, much of it is free, and the window in which it works best is right after diagnosis.

For ALS and MND patients specifically: do not wait. Contact your care team for an AAC referral this week, register with ModelTalker, and spend fifteen minutes today recording voice memos of the things you most want your family to be able to hear in your voice. The process will take weeks or months to complete properly; the first session can happen today.

For anyone facing scheduled voice-affecting surgery: your recording window is defined and finite. Prioritize this alongside the clinical preparation.

Modern AI voice technology — including tools like VoxBooster — has made it possible to preserve your voice with less time and less technical burden than ever before. The human reason to do it has not changed.

This post is informational and does not constitute medical advice. For guidance specific to your diagnosis, work with your neurologist and a speech-language pathologist with AAC specialization.
