
Me, Myself, and (A)I: Voice, Attention, and the Sonic Ad Economy

In her latest ExchangeWire column, Shirley Marschall charts the evolution of audio, from the ringtones of a loud blue frog to a potential future of algorithmically curated AI assistants.

Sound is an interesting creature, trapped somewhere between “avoid at all costs” and “anything but silence.”

And in the midst of our muted culture, AI voice, Google Zero, AI overviews, podcasts, and TikTok are dragging sound back to the center of attention. A true sonic jungle of noise, voice (AI), sound, and audio.

Ring ding ding ding ding a ring ding ding… That was the Crazy Frog, circa early 2000s. One of the most downloaded ringtones in history, and more likely than not, the reason mobile phones ringing in public became a global faux pas.

And remember when we used to talk on the phone? Actually answer calls? Speak to humans? Luckily, the mailbox saved us, at least for a while, until it started telling us not to leave voicemails but to text instead. Voice notes came next. Great for the lazy typers, less so for the unlucky listener trying to find the right “environment” to hit play.

We traded ringtones for vibrations, calls for texts… and now suddenly we’re all excited because GenAI has voice mode.

The only thing more confusing than AI’s pace of change might be… people.

But here we are, back in audio land.

Audio’s second chance

Why? Because Google Zero, Voice AI, TikTok, podcasts, and content overload are pushing publishers and creators to reimagine storytelling. Written articles might no longer be enough. Now, everyone wants sound and motion, even if it means letting AI do the talking.

For years, audio advertising and podcast monetisation lived in that familiar “up-and-coming” limbo. Always full of potential, rarely prioritised. But the rise of voice interfaces and screenless experiences is changing the tone, quite literally.

And where attention goes, budgets slowly follow and audio is quietly getting loud again… a sonic boom.

Why? Because listening feels more intentional than scrolling. Because sound makes people feel things in a way a banner never will. And because brand recall and emotional connection are measurably stronger with audio.

And smart speakers and AI companions are opening new frontiers for branded voice interactions - think dynamic prompts, ambient nudges, even full-on synthetic cohosts.

In a world drowning in visual clutter, sound is the white space. It’s intimate, immersive, and surprisingly sticky.

The tech that talks back

Like most things in advertising, this isn’t exactly new.

Remember when voice was going to revolutionise shopping?

That future already had a name: Alexa. Just say “Alexa, order more toothpaste” and your household would be magically restocked. That was the vision of frictionless voice commerce. Except… it never really happened, and Amazon quietly pulled back.

And Alexa? Mostly used for weather, kitchen timers, and dimming the lights.

But now, GenAI is giving voice another shot.

ChatGPT can speak to you in real time. Google Gemini wants to be your life coach, your co-pilot, your podcast guest. And of course Meta’s getting in on the action, because apparently everyone needs an AI-generated “friend” whispering sweet prompts in their ear.

And to be fair, it’s impressive. The latency is low, the tone is warm, the cadence sounds… almost human.

Sonic (re)branding

“What does your brand sound like?” Yeah, that conversation is back as well but with a (sonic) rebrand.

Sonic branding isn’t just a jingle anymore. It’s a chatbot’s voice, a product’s audio signature or the rhythm of a prompt. It’s the “soundtrack” of a brand experience in an AI-mediated world.

And if text gets rephrased, visuals get AI-generated, and influencers become avatars, voice still carries something real (at least for now). 

The industry, predictably, reacts.

Retailers are already rolling out voice-led shopping journeys, guiding users step by step like a sales assistant that never needs a break. Wellness apps are going full audio-first, AI therapists included. Brands are scaling with synthetic narrators, producing “human” content with zero human cost. And advertisers? They’re paying attention to the data. Audio consistently scores higher on attention, retention, and emotional impact.

Voice versus noise

The line between voice and noise is a really thin one, though. Yes, audio is on the rise, but so are noise-cancelling headphones.

It almost seems like everyone wants to talk but no one wants to listen. (To other people talking.)

You = noise. Me = voice.

Your voice = my noise. My voice = your noise.

People talking = noise. People podcasting = voice. 

Talking to AI = voice. Hearing someone else talk to AI = noise. 

Where on the noise versus voice scale will digital audio ads, AI voice, and the like end up? Will noise cancelling of the future auto-filter noise/voice based on personal preferences? Adjust the filter based on user behaviour and potentially turn into sonic ad blockers?

Will we curate our soundscapes as carefully as we curate our screens?

The algorithmically curated soundscape…

…AR glasses on, noise-cancelling earbuds in, AI companion whispering context-aware nudges. Conversations auto-unmute when proximity is detected. Meetings are recorded and replayed. Voice synthesis bots do the talking, while AI scribes do the remembering. Ad impressions don’t blink or flash but they speak. Softly, personally, basically branded whispers, wrapped in context.

An audio-rich, immersive, personalised and entirely algorithmically curated world.

Just: Me, myself and (A)I