Perfectly imperfect: AI voice companions evolve beyond ChatGPT with unsettling realism

A new conversational AI called Sesame is raising eyebrows with its uncannily human-like speech patterns, complete with hesitations, self-corrections, and natural interruptions. Where conventional AI assistants generate text and then read it aloud, Sesame’s Conversational Speech Model (CSM) produces speech that mirrors authentic human conversation, which could mark a significant shift in how we interact with AI systems.

The big picture: Sesame represents a departure from conventional AI voice assistants by deliberately incorporating human imperfections rather than striving for polished perfection.

How it works: Sesame’s Conversational Speech Model combines text and audio processing into a single unified system, enabling more natural speech generation.

  • Unlike ChatGPT and Gemini, which the source review describes as generating text first and then converting it to speech, Sesame creates speech directly, with human-like pauses, tonal shifts, and filler words.
  • The system can interrupt conversations, apologize for interruptions, and even change its “mind” mid-sentence, mirroring natural human speech patterns.

Key features: The AI demonstrates conversational behaviors that traditional voice assistants lack.

  • It produces natural chuckles when saying something mildly amusing.
  • The system incorporates thoughtful pauses before responding to questions.
  • It seamlessly handles interruptions in both directions, creating more authentic dialogue.

Why this matters: Sesame replicates the imperfections of human speech so accurately that it raises questions about the future of AI-human interaction and about how hard it is becoming to tell human and AI voices apart.

Looking ahead: Sesame remains a niche technology for now, but its development points to a future in which people on a phone call may need to verify whether the voice on the other end is human or AI.

I tried the most realistic AI voice companion ever created - if ChatGPT or Gemini ever gets this good, reality is in trouble
