ElevenLabs bets on voice as AI’s next big interface
As AI moves beyond keyboards and screens, voice could redefine how users interact with machines. ElevenLabs is making a US$500 million bet that it’s right.
At Web Summit in Doha, ElevenLabs co-founder and chief executive Mati Staniszewski shared the company's vision: voice-first interaction with intelligent agents that understand context, remember past conversations, and run seamlessly on cloud and device. He’s not alone. Google, OpenAI, and Apple are all racing to embed voice across operating systems, wearables, and virtual platforms.
For marketers and brand builders, this shift isn’t theoretical. Voice tech is rapidly moving into user-facing experiences — from virtual assistants and smart glasses to in-app avatars and interactive ads. The question isn’t if voice will matter, but how fast it will change marketing playbooks.
This article explores how voice is becoming the next major interface for AI, why ElevenLabs is leaning in, and what this shift means for marketers working across emerging formats, hardware, and brand experiences.
Short on time?
Here’s a table of contents for quick access:
- Why ElevenLabs is betting on voice-first AI
- How AI voice models are becoming more agentic
- What marketers should know about voice-based AI

Why ElevenLabs is betting on voice-first AI
ElevenLabs, known for high-quality voice synthesis, raised US$500 million this week at a US$11 billion valuation. The company argues that voice will become the dominant interface for AI as systems grow more contextual, hands-free, and embedded in daily environments.
Speaking to TechCrunch at Web Summit, Staniszewski emphasized that voice tech has surpassed basic speech mimicry. It now integrates with language models to deliver emotionally aware, reasoning-capable interactions. “Hopefully all our phones will go back in our pockets,” he said, envisioning a world where voice commands ambient computing through wearables and virtual environments.
Meta is already a partner, embedding ElevenLabs into Horizon Worlds and Instagram. Staniszewski also expressed interest in collaborations for Meta’s Ray-Ban smart glasses — a key entry point for voice-native experiences in hardware.
How AI voice models are becoming more agentic
The industry is moving from prompt-based models to agentic AI — systems that persistently remember, reason, and respond with less hand-holding. That shift changes what users expect from AI interactions.
As Iconiq Capital partner Seth Pierrepont noted during the summit, traditional inputs like keyboards are starting to feel outdated. Voice enables more natural exchanges, especially when paired with models that manage integrations, memory, and proactive behavior.
For ElevenLabs, this evolution demands a hybrid architecture. Instead of relying solely on cloud infrastructure, the company is building voice systems that also process data locally on devices. This is essential for wearables and always-on audio environments, where latency, battery, and privacy are top concerns.
This model positions voice not as an occasional interface, but a persistent one — operating in the background and adapting to user context over time.
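To make the hybrid idea concrete, here is a minimal sketch of the kind of routing decision such an architecture implies: run inference on-device when privacy or latency constraints demand it, and fall back to the cloud for heavier models. All names and thresholds here are hypothetical illustrations, not ElevenLabs' actual API or logic.

```python
from dataclasses import dataclass

@dataclass
class VoiceRequest:
    audio_ms: int          # length of the captured audio clip
    wake_word: bool        # wake-word detection must stay on-device
    contains_pii: bool     # privacy-sensitive audio stays local
    network_rtt_ms: int    # measured round-trip time to the cloud

def route(req: VoiceRequest, latency_budget_ms: int = 300) -> str:
    """Return 'device' or 'cloud' for a given request (illustrative only)."""
    if req.wake_word or req.contains_pii:
        return "device"    # always-on and privacy-sensitive paths stay local
    if req.network_rtt_ms > latency_budget_ms:
        return "device"    # network too slow: degrade gracefully on-device
    return "cloud"         # otherwise use the heavier cloud model

print(route(VoiceRequest(800, wake_word=True, contains_pii=False, network_rtt_ms=50)))   # device
print(route(VoiceRequest(800, wake_word=False, contains_pii=False, network_rtt_ms=50)))  # cloud
```

The point of the sketch is the trade-off itself: wearables and always-on audio can't afford a round trip for every utterance, so some decisions must be made locally even when a cloud model would be more capable.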
What marketers should know about voice-based AI
The rise of voice AI has big implications for how marketers build campaigns, design customer journeys, and engage audiences. Here’s what to watch:
- Voice UX is the new frontier
Whether it’s smart glasses, virtual worlds, or car interfaces, voice will increasingly become the input layer. Marketers must rethink creative formats, pacing, and tone to match conversational contexts.
- Persistent agents mean longer relationships
Voice-driven agents that remember user history create the opportunity for brands to deliver continuity and personalization across time, not just channels.
- Voice will impact privacy strategies
As voice becomes always-on and embedded in hardware, it will raise fresh questions around consent, data retention, and passive listening. Marketing teams should get ahead of regulatory shifts and consumer sentiment.
- New placements and touchpoints will emerge
Platforms like Horizon Worlds, voice search, and smart assistants will open new surfaces for branded experiences — but they’ll require custom creative and voice-centric strategy.
The takeaway: Voice isn’t just another format. It’s a fundamental shift in how users access and interact with digital ecosystems. Early adopters who learn to design for voice-first AI will have an edge.