AI Voice Generator for Gas Station Pay Pumps

Gas pump voice AI is the synthesized voice that walks you through every step at a pay-at-pump fuel dispenser — “Please insert your card,” “Select your grade,” “Lift the nozzle and begin fueling,” “Please take your receipt.” These prompts come from an embedded audio system built into Gilbarco Veeder-Root and Wayne Fueling Systems dispensers, the two hardware platforms that handle the majority of North American fuel retail. This guide covers how those prompts are built, what the complete audio set looks like, how Shell, BP, Chevron, and Petrobras approach voice branding, and how to produce professional-grade fuel pump audio with a modern AI voice generator.

TL;DR

Gas station pay pumps use synthesized voice AI to guide customers through card payment and fueling — insert card, select grade, take receipt.
Gilbarco Veeder-Root and Wayne Fueling Systems are the dominant dispenser hardware makers; their firmware plays WAV files loaded by the operator.
Shell, BP, Chevron, and Petrobras each maintain brand audio guidelines; franchise sites vary.
Multilingual pumps (English/Spanish/Portuguese) are standard at high-traffic locations in the US South, Southwest, and Latin America.
Audio production requires matching the low-bitrate WAV specs of embedded dispenser hardware — not just standard studio output.
VoxBooster’s AI voice engine can generate and export the full pump prompt set in any language, matched to hardware spec.

Why Gas Pumps Need Consistent AI Voice

Walk up to any self-service fuel dispenser in North America and the voice you hear is not a recording of a human employee — it is a synthesized prompt system embedded in the dispenser hardware. The practical reasons are straightforward. A fuel retail network may operate thousands of locations across multiple states or countries. A single prompt update — adding a new payment option, changing a safety warning, or refreshing a brand greeting — requires replacing audio files on thousands of units. That is only manageable if the audio was generated consistently from a script library, not sourced from one-off voice actor sessions.

The other driver is accuracy. Fuel pump prompts guide customers through a real-money transaction at a physical piece of outdoor equipment. Ambiguous or inaudible prompts create friction: customers who cannot understand whether the pump accepted their card, cannot identify the correct nozzle handle, or miss the receipt prompt end up going inside to speak with a cashier — which defeats the purpose of pay-at-pump entirely.

Networks that invest in clear, well-produced AI voice prompts tend to see fewer customer service interruptions, fewer pump aborts, and better throughput during peak hours. The audio is a small line item in a dispenser deployment budget and a disproportionately large factor in the customer experience.

Gilbarco Veeder-Root and Wayne Fueling Systems: The Hardware Platforms

Understanding fuel pump voice production starts with the hardware. In North America, two manufacturers dominate the forecourt dispenser market:

Gilbarco Veeder-Root (a Vontier company, spun off from Fortive in 2020) produces the Encore, Edge, and Passport product lines. Their dispensers are widely deployed at Chevron, BP, and independent sites. Gilbarco’s embedded audio system plays pre-loaded WAV files from internal flash storage. The site controller (typically a Gilbarco Passport or a third-party POS integrated via the Gilbarco API) determines which audio triggers play at each transaction state.

Wayne Fueling Systems (a Dover Fueling Solutions brand) produces the Ovation, Helix, and iXPay dispensers. Wayne hardware is dominant at many Shell, ExxonMobil, and large chain sites. Wayne dispensers similarly use a firmware audio library with WAV files, and the Wayne Nucleus cloud management platform allows operators to push audio updates remotely across a fleet.

Both platforms have legacy hardware in the field that accepts only 8 kHz or 16 kHz PCM WAV — a sample rate inherited from early 1990s dispenser hardware that reduced storage requirements. Newer generation hardware (Gilbarco Encore 700 S and Wayne Helix) supports 44.1 kHz, which dramatically improves voice quality. When producing prompts for a mixed fleet, it is safest to produce at 44.1 kHz and then downsample to 16 kHz for older units — downsampling preserves more quality than generating natively at 16 kHz.
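As a rough illustration of the downsampling step, the sketch below converts 44.1 kHz samples to 16 kHz with a windowed-sinc low-pass filter, using only the Python standard library. The function name, the filter half-width of 32 input samples, and the Hann window are illustrative assumptions, not anything specified by Gilbarco or Wayne. A production pipeline would more likely rely on a dedicated resampling library.

```python
import math

def resample(samples, src_rate=44100, dst_rate=16000, half_width=32):
    """Resample a list of floats with a Hann-windowed sinc low-pass filter."""
    # Cut off at the lower of the two Nyquist limits, in cycles per input sample.
    cutoff = 0.5 * min(1.0, dst_rate / src_rate)
    step = src_rate / dst_rate          # input samples per output sample
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for n in range(n_out):
        t = n * step                    # output instant on the input time axis
        first = max(0, math.ceil(t - half_width))
        last = min(len(samples) - 1, math.floor(t + half_width))
        acc = 0.0
        for k in range(first, last + 1):
            d = t - k
            x = 2.0 * cutoff * d
            sinc = 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)
            window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            acc += samples[k] * 2.0 * cutoff * sinc * window
        out.append(acc)
    return out
```

The low-pass cutoff sits at the 8 kHz Nyquist limit of the 16 kHz target, so content above that is attenuated before the sample rate is reduced instead of aliasing back into the speech band.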

Feature	Gilbarco Veeder-Root	Wayne Fueling Systems
Key models	Encore, Edge, Passport	Ovation, Helix, iXPay
Common networks	Chevron, BP, independent	Shell, ExxonMobil, chain
Audio format (legacy)	WAV PCM 16-bit, 8–16 kHz	WAV PCM 16-bit, 8–16 kHz
Audio format (new)	WAV 44.1 kHz (Encore 700 S)	WAV 44.1 kHz (Helix)
Remote audio update	Passport site controller	Wayne Nucleus cloud
Multilingual support	Yes, file-per-language	Yes, file-per-language

The Complete Fuel Pump Audio Prompt Set

A well-designed pay-at-pump audio system covers every transaction state. Below is a reference table for a full deployment. Note that the exact phrasing varies by network brand guidelines — what is shown here is the neutral generic form.

Prompt ID	Script (neutral English)	Trigger State
WELCOME	“Welcome. Please insert or tap your card.”	Customer approach / pump wake
CARD_INSERT	“Please insert your card into the slot.”	Card not yet detected
TAP_TO_PAY	“Tap your card or phone to pay contactlessly.”	NFC payment enabled, no card inserted
PIN_ENTRY	“Please enter your PIN and press Enter.”	Chip/PIN card detected
ZIP_ENTRY	“Please enter your billing zip code.”	Credit card ZIP verification
CAR_WASH	“Would you like to add a car wash today?”	Upsell trigger after auth
GRADE_SELECT	“Please select your fuel grade.”	Authorization approved
NOZZLE_LIFT	“Lift the nozzle and begin fueling.”	Grade selected
FUELING_START	“Fueling has begun.”	Nozzle flow sensor active
FUELING_STOP	“Fueling complete.”	Nozzle returned
RECEIPT_OFFER	“Would you like a receipt? Press Yes or No.”	Transaction close
RECEIPT_PRINT	“Please take your receipt.”	Receipt printing
NO_RECEIPT	“Thank you. Have a safe trip.”	No receipt selected
CARD_DECLINED	“Your card was not approved. Please try another card.”	Auth declined
PUMP_FAULT	“This pump is temporarily out of service. Please see the cashier.”	Hardware fault
NOZZLE_ERROR	“Nozzle not detected. Please hang up the nozzle and try again.”	Nozzle sensor fault

Producing all 16+ prompts from a single AI voice generator session ensures vocal consistency across the transaction. A customer who hears the welcome prompt in one voice and the receipt prompt in a noticeably different voice registers the inconsistency as a quality signal — subtle, but real.

Shell, BP, Chevron, and Petrobras Brand Audio Guidelines

The major oil networks each have voice brand standards that go beyond just choosing a voice gender. Here is how the four largest networks approach audio branding at the pump:

Shell maintains a global brand voice that emphasizes clarity and approachability. Shell-branded dispensers at company-owned sites use a neutral female voice with a moderate North American accent in the US. International Shell sites adapt the voice profile to regional standards but maintain the same friendly, non-pressured tone. Shell’s audio guidelines specify minimum intelligibility standards — the voice must score above a defined STIPA (Speech Transmission Index for Public Address) threshold on the outdoor speaker hardware at the forecourt.

BP (formerly British Petroleum) uses a similarly neutral voice for its US network, often with slightly warmer intonation than competitor networks. BP’s global network spans enough regions that its audio team maintains language variants for North American English, UK English, German, Dutch, and several other markets. The consistency requirement, that a BP pump in Houston and a BP pump in Amsterdam both feel like the same brand, drives the use of AI voice generation rather than country-by-country voice actor casting.

Chevron (which also operates Texaco sites in many markets) takes a more functional approach to pump audio — the voice is clean and direct rather than notably warm or branded. Chevron’s dispenser audio has historically been one of the more conservative in the US market, prioritizing intelligibility over personality. Their bilingual English/Spanish requirement at California sites is among the more stringent in the North American market.

Petrobras operates the largest fuel retail network in Latin America, with thousands of sites across Brazil. Petrobras pump audio is primarily in Brazilian Portuguese (pt-BR), with a distinctly different phonetic profile from European Portuguese — the vowel sounds, prosody, and intonation contours are different enough that using an EU-PT voice model for Brazilian sites produces noticeably unnatural output. AI voice generators that support pt-BR natively are essential for this market, not a convenience.

Multilingual Gas Pump Audio: English, Spanish, and Portuguese

The most common multilingual requirement in North American fuel retail is English and Spanish. In states with large Spanish-speaking populations — California, Texas, Florida, Arizona, New Mexico — pump operators face both commercial pressure and regulatory requirements to offer Spanish-language prompts.

The register choice for Spanish pump audio follows the same convention as airline and banking IVR: formal “usted” rather than informal “tú.” A payment terminal that addresses the customer informally reads as unprofessional and low-quality in the Latin American market. All Spanish prompts for fuel dispensers should use the formal usted register and avoid regionally specific idioms that may not carry across Mexican, Caribbean, and South American Spanish variants.

Portuguese requirements are more specialized. US fuel retail generally does not require Portuguese at scale, but operators in South Florida (which has a large Brazilian community) and in any border-crossing or transit corridor context may deploy pt-BR as a third language. More significantly, any operator deploying Petrobras or other Latin American networks needs genuine pt-BR voice production — not Spanish with a vowel shift, not EU-PT, but properly accented Brazilian Portuguese with correct stress patterns.

The language detection pipeline at a multilingual fuel pump works like this:

The payment terminal reads the card’s BIN (Bank Identification Number). Some issuers include locale metadata in the BIN data that lets the dispenser infer a preferred language.
The touchscreen displays a language selector at the start of the transaction — typically as a flag icon or “English / Español / Português” prompt.
The site controller routes the customer’s language choice to the firmware audio player, which plays the correct language track for each subsequent prompt step.
If no choice is made within a timeout window, the dispenser defaults to English (US standard) or the operator-configured default.
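One way to read the steps above is as a fallback chain. The sketch below assumes a precedence order (explicit touchscreen choice, then a BIN-derived hint, then the operator default); that ordering, the language codes, and the file-per-language path scheme are illustrative assumptions and do not describe actual Gilbarco or Wayne firmware behavior.

```python
SUPPORTED = ("en", "es", "pt-BR")

def resolve_language(touch_choice=None, bin_hint=None, default="en"):
    """Pick the prompt language: explicit choice, then BIN hint, then default.

    touch_choice is None when the selector timed out without a selection.
    """
    for candidate in (touch_choice, bin_hint):
        if candidate in SUPPORTED:
            return candidate
    return default

def prompt_path(prompt_id, language):
    """Hypothetical file-per-language layout, e.g. es/WELCOME.wav."""
    return f"{language}/{prompt_id}.wav"

# A customer who picks Spanish on the touchscreen hears the Spanish track,
# regardless of what the card data suggested.
print(prompt_path("WELCOME", resolve_language(touch_choice="es", bin_hint="pt-BR")))
```

Treating an unsupported or missing hint the same as no hint keeps the timeout case and the unknown-locale case on the same code path.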

Producing a three-language prompt set — English, Spanish, Portuguese — means three versions of every prompt in the table above, approximately 48–60 audio files total, all generated from the same master script with appropriate translations.
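The lower bound of that file count can be checked mechanically. The sketch below builds a file manifest from the 16 prompt IDs in the reference table and three language codes; the directory layout and language codes are assumptions chosen for illustration.

```python
from itertools import product

PROMPT_IDS = [
    "WELCOME", "CARD_INSERT", "TAP_TO_PAY", "PIN_ENTRY", "ZIP_ENTRY",
    "CAR_WASH", "GRADE_SELECT", "NOZZLE_LIFT", "FUELING_START",
    "FUELING_STOP", "RECEIPT_OFFER", "RECEIPT_PRINT", "NO_RECEIPT",
    "CARD_DECLINED", "PUMP_FAULT", "NOZZLE_ERROR",
]
LANGUAGES = ["en", "es", "pt-BR"]

def build_manifest(prompt_ids=PROMPT_IDS, languages=LANGUAGES):
    """List every audio file a deployment needs, one per prompt per language."""
    return [f"{lang}/{pid}.wav" for lang, pid in product(languages, prompt_ids)]

manifest = build_manifest()
print(len(manifest))  # 16 prompts x 3 languages = 48 files
```

Any prompts added beyond the reference table, such as the payment security messages, push the total toward the upper end of the range.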

Audio Engineering for Outdoor Fuel Dispenser Speakers

The acoustic environment at a gas station forecourt is hostile to speech clarity. Ambient noise includes:

Traffic noise: 65–80 dB SPL on a busy arterial road
Canopy echo: the metal or fiberglass overhead canopy creates early reflections that muddy consonants
Wind: gusts at 10–20 mph add broadband noise at the customer’s listening position
Engine noise: customer vehicles idling at 50–60 dB

The dispenser speaker is typically a small full-range cone driver (3–4 inches) in a sealed plastic housing, rated at 5–10 W RMS. The frequency response peaks around 1–3 kHz and rolls off sharply below 200 Hz and above 8 kHz. A voice that sounds natural and warm on studio monitors sounds thin and reedy through this hardware in an outdoor ambient of 70 dB.

Optimizing AI voice audio for outdoor fuel dispenser speakers requires the same EQ treatment as other outdoor public-address systems:

Step 1 — High-pass filter at 200 Hz

The dispenser speaker cannot reproduce meaningful bass below 200 Hz. Any energy below that creates distortion inside the housing rather than audible sound. Apply a 24 dB/octave Butterworth high-pass at 180–200 Hz to the generated audio before export.
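A 24 dB/octave Butterworth response can be built from two cascaded second-order sections. The sketch below uses the widely published Audio EQ Cookbook high-pass formulas in plain Python; the function names are hypothetical, and samples are assumed to be floats with full scale at 1.0. An audio editor or DSP library would normally do this job.

```python
import math

# Section Q values that make two cascaded biquads a 4th-order Butterworth:
# 1 / (2 * cos(pi / 8)) and 1 / (2 * cos(3 * pi / 8)).
BUTTERWORTH_4_Q = (0.54119610, 1.30656296)

def highpass_biquad(cutoff_hz, sample_rate, q):
    """Return normalized (b0, b1, b2, a1, a2) for a cookbook high-pass section."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0 = (1.0 + cos_w0) / 2.0 / a0
    b1 = -(1.0 + cos_w0) / a0
    return b0, b1, b0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0

def run_biquad(samples, coeffs):
    """Apply one section using the direct form I difference equation."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def highpass_24db(samples, cutoff_hz=200.0, sample_rate=44100):
    """Cascade both sections for a 4th-order (24 dB/octave) high-pass."""
    for q in BUTTERWORTH_4_Q:
        samples = run_biquad(samples, highpass_biquad(cutoff_hz, sample_rate, q))
    return samples
```

With the cutoff at 200 Hz, a 50 Hz component sits two octaves down and is attenuated by roughly 48 dB, while speech energy around 1 kHz passes essentially unchanged.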

Step 2 — Presence boost at 2–4 kHz

The 2–4 kHz band carries the cues that distinguish the most important speech consonants: “s,” “t,” “f,” and “k.” Boosting this range by 2 to 3 dB with a shelf or bell filter significantly improves intelligibility in ambient noise without making the voice harsh through headphones.
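To see what a bell-shaped boost of this size does across the spectrum, the sketch below designs a peaking filter from the Audio EQ Cookbook formulas and evaluates its gain at a few frequencies. The center frequency of 2830 Hz (roughly the geometric mean of 2 and 4 kHz), the Q of 1.4, and the 2.5 dB gain are assumed values chosen to match the range described above.

```python
import cmath
import math

def peaking_biquad(center_hz, gain_db, q, sample_rate):
    """Cookbook peaking EQ, returned as ((b0, b1, b2), (a0, a1, a2))."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp)
    a = (1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp)
    return b, a

def gain_db_at(freq_hz, b, a, sample_rate):
    """Magnitude response of the filter in dB at a single frequency."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)
    num = b[0] + b[1] * z_inv + b[2] * z_inv * z_inv
    den = a[0] + a[1] * z_inv + a[2] * z_inv * z_inv
    return 20.0 * math.log10(abs(num / den))

b, a = peaking_biquad(center_hz=2830.0, gain_db=2.5, q=1.4, sample_rate=44100)
for freq in (200, 1000, 2830, 8000):
    print(freq, round(gain_db_at(freq, b, a, 44100), 2))
```

The gain reaches the full 2.5 dB only at the center frequency and tapers toward 0 dB on either side, which is why a bell filter leaves the low end and the top octave largely untouched.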

Step 3 — Peak normalization

Target peaks at -3 dBFS with a limiter at -1 dBFS. Dispenser audio players typically use fixed gain levels in firmware. Consistent peak levels across all audio files prevent some prompts from playing noticeably louder or softer than others during a transaction — a disorienting experience for customers.
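The level targets translate into simple gain arithmetic. The sketch below assumes float samples with full scale at 1.0; the hard clamp is only a crude stand-in for a real limiter, which would use look-ahead and smooth gain reduction.

```python
def db_to_linear(dbfs):
    """Convert a dBFS level to a linear amplitude (0 dBFS = 1.0)."""
    return 10.0 ** (dbfs / 20.0)

def normalize_peak(samples, target_dbfs=-3.0):
    """Scale samples so the largest absolute peak lands on the target level."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)            # silence: nothing to scale
    gain = db_to_linear(target_dbfs) / peak
    return [s * gain for s in samples]

def hard_ceiling(samples, ceiling_dbfs=-1.0):
    """Clamp anything above the ceiling; a simplification, not a true limiter."""
    limit = db_to_linear(ceiling_dbfs)
    return [max(-limit, min(limit, s)) for s in samples]

prompt = [0.10, -0.50, 0.25]
leveled = hard_ceiling(normalize_peak(prompt))
print(max(abs(s) for s in leveled))     # about 0.708, i.e. -3 dBFS
```

Running every file in the prompt set through the same function is what keeps the loudest moment of each prompt at the same level.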

Step 4 — Export format

Legacy Gilbarco Veeder-Root and Wayne hardware: WAV PCM 16-bit, 16 kHz (or 8 kHz for oldest units). New generation hardware: WAV PCM 16-bit or 24-bit, 44.1 kHz. Always confirm the target hardware spec with the site controller documentation before finalizing the export.
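The export and a basic header check can both be done with Python’s standard `wave` module. In the sketch below, mono output is an assumption (the spec above states bit depth and sample rate only), and the function names are hypothetical.

```python
import struct
import wave

def write_pcm16_wav(path, samples, sample_rate=16000):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)         # mono is an assumption, not a stated spec
        wav.setsampwidth(2)         # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))

def matches_spec(path, sample_rate, sample_width_bytes=2):
    """Check a file's header against the target sample rate and bit depth."""
    with wave.open(path, "rb") as wav:
        return (wav.getframerate() == sample_rate
                and wav.getsampwidth() == sample_width_bytes
                and wav.getcomptype() == "NONE")
```

Running `matches_spec` over the whole export folder before pushing files to the site controller catches a stray 44.1 kHz file destined for a legacy 16 kHz unit.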

These processing steps are identical in principle to the outdoor speaker optimization needed for EV charging station voice prompts and parking garage PA systems — the acoustic constraints are consistent across outdoor public-address applications.

Producing Gas Pump Voice Prompts: Step-by-Step Workflow

Whether you are producing prompts for a single-site operator or a 500-location chain, the workflow follows the same structure:

1. Build the master script

Create a document with every prompt organized by prompt ID, trigger state, script text, language, and notes. The reference table earlier in this article is a starting point. Add or remove prompts based on your dispenser’s feature set — not all dispensers support car wash upsells or contactless payment, for example.

For bilingual deployments, add a column per language. Keep all translations in the same row so you can verify prompt-by-prompt parity across languages.
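Parity across languages is easy to check automatically once the master script is held as structured data. The sketch below assumes a dictionary keyed by prompt ID with one entry per language; the Spanish line is an illustrative translation, not vetted copy.

```python
def missing_translations(master_script, languages):
    """Return {prompt_id: [languages with no script text]} for incomplete rows."""
    gaps = {}
    for prompt_id, row in master_script.items():
        missing = [lang for lang in languages if not row.get(lang, "").strip()]
        if missing:
            gaps[prompt_id] = missing
    return gaps

master_script = {
    "WELCOME": {
        "en": "Welcome. Please insert or tap your card.",
        "es": "Bienvenido. Inserte o acerque su tarjeta.",
    },
    "RECEIPT_PRINT": {"en": "Please take your receipt.", "es": ""},
}
print(missing_translations(master_script, ["en", "es"]))
# {'RECEIPT_PRINT': ['es']}
```

An empty result means every prompt has text in every language and the set is ready for generation.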

2. Choose a consistent voice profile

Select a single voice model and apply the same speaking rate and pause parameters throughout. The voice character should match the network brand guidelines — neutral and functional for most fuel retail brands, slightly warmer for premium or boutique fuel brands. Avoid voices with strong regional accents unless the deployment market specifically calls for one.

A comfortable speaking rate for fuel pump prompts is 130–145 words per minute. Faster than that and customers cannot follow instructions at the pump; slower than that and the prompts feel condescending.
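The speaking rate translates directly into prompt length. The sketch below estimates duration from word count alone and ignores pauses and punctuation, so treat the result as a lower bound.

```python
def estimated_seconds(script, words_per_minute):
    """Rough spoken duration from word count alone (ignores pauses)."""
    return len(script.split()) / words_per_minute * 60.0

script = "Please enter your PIN and press Enter."
for wpm in (130, 145):
    print(wpm, round(estimated_seconds(script, wpm), 2))
```

For this seven-word prompt the estimate is about 3.2 seconds at 130 words per minute and about 2.9 seconds at 145, which is a useful sanity check against the dispenser’s timeout for each transaction state.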

3. Generate and apply the outdoor processing chain

Generate each prompt, then apply the EQ processing chain described above: high-pass at 180–200 Hz, presence boost at 2–4 kHz, peak normalization to -3 dBFS, limiter at -1 dBFS. Export in the format required by the target hardware.

4. QA in outdoor conditions

Test the exported files through a speaker that approximates the dispenser hardware in an outdoor ambient setting. A portable Bluetooth speaker at arm’s distance in a parking lot on a sunny afternoon is a reasonable approximation. If consonants are indistinct or the voice gets lost in ambient noise, revisit the presence boost and the speaking rate.

5. Version control and maintenance

Fuel pump prompts require ongoing maintenance. Payment method evolution (adding tap-to-pay, mobile wallet prompts), brand refresh campaigns (new greeting scripts), and regulatory changes (updated card security prompts) all require re-generating specific files. An AI voice generator makes this fast: update the script, regenerate the affected files, apply the processing chain, push to the site controller.
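Deciding which files are affected by a script change can be automated by fingerprinting each prompt. The sketch below hashes the script text together with the voice settings and compares against digests stored from the previous build; the settings keys and the storage format are assumptions.

```python
import hashlib

def script_digest(text, voice_settings):
    """Fingerprint of everything that affects the rendered audio."""
    payload = text.strip() + "|" + repr(sorted(voice_settings.items()))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def prompts_to_regenerate(scripts, voice_settings, previous_digests):
    """Return prompt IDs whose script or voice settings changed since last build."""
    stale = []
    for prompt_id, text in scripts.items():
        if previous_digests.get(prompt_id) != script_digest(text, voice_settings):
            stale.append(prompt_id)
    return sorted(stale)
```

Because the voice settings are part of the fingerprint, changing the speaking rate or voice profile marks every prompt as stale, which preserves the single-voice consistency the prompt set depends on.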

Gas Pump AI Voice and Payment Security Prompts

One category of fuel pump prompts deserves special attention: payment security messaging. EMV chip migration, contactless payment adoption, and card skimming prevention campaigns have generated a new set of prompts that most legacy pump audio sets do not include.

Current payment security prompts include:

“This pump is EMV chip-enabled. Please insert your chip card.”
“Do not tap your card until the screen shows the contactless symbol.”
“This pump has been inspected for skimming devices. If you see anything suspicious, please see the cashier.”
“For your security, your card has been encrypted end-to-end.”

These prompts are often required by the card network (Visa, Mastercard) or the acquiring bank as a condition of the dispenser’s EMV certification. They must be accurate, legally vetted, and consistent with the specific EMV certification level the hardware holds. AI voice generation lets operators produce and update these prompts quickly when certification requirements change.

VoxBooster for Fuel Pump Audio Production

VoxBooster’s AI voice engine handles the systematic, high-volume audio production that fuel pump deployments require. You script each prompt, choose from a range of voice profiles, generate the audio, and export in the WAV format your hardware requires. For multilingual deployments — English, Spanish, and Brazilian Portuguese as the common North American trifecta — VoxBooster produces all language variants from the same script library without switching tools.

The AI voice generation workflow also supports producing audio at different sample rates from the same session, which is useful when a fleet has mixed-generation hardware and requires both 16 kHz files for legacy units and 44.1 kHz files for newer dispensers.

For related AI voice production contexts that share the outdoor public-address audio engineering requirements, see our guides on AI voice for EV charging stations and AI voice for parking garage PA systems. For voice production in other self-service retail settings, AI voice for self-checkout kiosks covers similar hardware constraints and accessibility requirements. If you are building a broader voice content library, voice cloning for voiceover production and AI voice tools for content creators cover overlapping workflows.

Accessibility and Compliance at the Fuel Dispenser

ADA compliance for fuel dispensers has specific audio requirements. The Americans with Disabilities Act technical standards for accessible transactions require that automated teller functions, which include pay-at-pump credit card transactions, be accessible to customers with visual impairments. This means:

Audio prompts must be available for each step of the transaction without requiring the customer to select an accessibility mode first.
The audio must be playable through a standard 3.5 mm headphone jack on the dispenser (required for customers using assistive listening devices).
Volume must be adjustable by the customer.
Speech must be intelligible at the designated listening position against expected ambient noise.

The headphone jack requirement is significant from a production standpoint: the same audio files that play through the outdoor speaker also play through the headphone output. This means the outdoor EQ treatment (presence boost at 2–4 kHz, high-pass filter) must not make the audio unpleasant through headphones. The solution is to apply a moderate rather than aggressive treatment — +2 dB at 2–4 kHz rather than +4 dB — which improves outdoor intelligibility enough to meet the compliance threshold without being harsh through headphones.

Title III of the ADA and related accessibility guidance also call for prompts that do not assume the customer can see the screen. Every instruction that refers to a visual element on the screen must have an audio equivalent that does not rely on the customer seeing the visual. “Please press the green button” is non-compliant; “Please press the button on the left of the screen labeled ENTER” is compliant.

Frequently Asked Questions

What is gas pump voice AI?

Gas pump voice AI is a synthesized text-to-speech system embedded in pay-at-pump fuel dispensers. It plays scripted audio prompts at each transaction step — insert card, select grade, lift nozzle, begin fueling, take receipt — replacing the need for live attendants to verbally guide customers through the payment and fueling sequence.

Who makes the voice on gas station pumps?

The audio on gas station pumps is produced by the dispenser manufacturer or the oil company brand team. Gilbarco Veeder-Root and Wayne Fueling Systems are the two dominant hardware manufacturers in North America. Their dispenser firmware plays WAV audio files that operators load into the unit, either generated by AI voice tools or recorded by voice actors, depending on the deployment era.

Can an AI voice generator create gas pump prompts?

Yes. A modern AI voice generator lets you script the full pay-pump transaction sequence — authentication, grade selection, fueling start/stop, receipt options — and export audio files in the WAV format most dispenser firmware requires. You can produce the same script in English, Spanish, Portuguese, or other languages from one workflow, without hiring separate voice talent per language.

What audio prompts does a gas pump need?

A complete fuel dispenser audio set covers: welcome greeting, card insert or tap-to-pay prompt, PIN entry instruction, car wash upsell, grade selection (regular, plus, premium, diesel), lift nozzle instruction, begin fueling confirmation, fueling in progress (optional), fueling complete, receipt offer, thank-you close, and error messages for declined cards, pump faults, and nozzle detection failures.

How do multilingual gas pump prompts work?

Multilingual fuel dispensers detect the customer’s language preference from the payment terminal locale, the network operator back-end, or a touchscreen language selector at the start of the transaction. The dispenser firmware then plays the corresponding audio file for each prompt step. AI voice generators produce the full prompt set in each required language from the same master scripts.

What audio format do gas pump dispensers use?

Most Gilbarco Veeder-Root and Wayne Fueling Systems dispensers accept WAV files at PCM 16-bit, 8 kHz or 16 kHz sample rate — a legacy spec driven by the embedded hardware in older units. Newer dispenser platforms support 44.1 kHz PCM. Always check the site controller documentation for your specific hardware before producing the final export.

How do Shell, BP, and Chevron handle pump voice branding?

Shell, BP, Chevron, and Petrobras each maintain brand audio guidelines that specify voice tone, pace, and greeting language for their networks. Company-owned sites follow those brand standards closely; independently owned franchise sites often use the dispenser manufacturer’s default prompts. AI voice generators let branded networks produce consistent, on-brand audio across thousands of sites without a re-recording session for each script update.

Conclusion

Gas pump voice AI is not glamorous infrastructure, but it is infrastructure that handles millions of customer interactions every day across fuel retail networks built on Gilbarco Veeder-Root and Wayne Fueling Systems dispensers. Getting the prompts right — clear, consistently voiced, multilingual where required, ADA-compliant, matched to the speaker hardware’s acoustic limits — is the difference between a transaction that completes smoothly and one that ends with the customer walking inside to ask a cashier what the pump just said.

Shell, BP, Chevron, and Petrobras have each invested in brand audio guidelines because they understand that the pump voice is a brand touchpoint at every fueling transaction. The production requirement is systematic: build the master script, generate with a consistent AI voice, apply the outdoor processing chain, and maintain a version-controlled audio library that can be updated when payment methods, brand messaging, or compliance requirements change.

If you are producing fuel pump audio — whether for a single-operator site or a multi-network fleet — VoxBooster provides the AI voice generation tools to build and maintain the complete prompt set. The 3-day free trial lets you generate a sample transaction sequence and export it in the WAV format your hardware requires before committing to a full production run.