
Multimodal AI: Expanding Vertical AI’s Impact: The emergence of multimodal models capable of processing audio, video, voice, and vision data is creating new opportunities for vertical AI applications to transform a wider range of industries and workflows.

Key advancements in multimodal architecture:

  • Recent models have demonstrated improved context understanding, reduced hallucinations, and enhanced reasoning capabilities.
  • Performance in speech recognition, image processing, and voice generation is approaching or surpassing human capabilities in some cases.
  • New speech-native models, like those behind OpenAI’s Realtime API and Kyutai’s Moshi, are replacing cascaded architectures (separate speech-to-text, language model, and text-to-speech stages), offering lower latency and better context capture.
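
To make the cascaded versus speech-native distinction concrete, here is a toy sketch. It is not based on the Realtime API or Moshi; every name in it (`Audio`, `transcribe`, `respond`, `synthesize`, and both agent functions) is hypothetical, and `tone` is only a stand-in for the non-textual information carried in speech. It illustrates one reason a cascade can lose context: only text passes between its stages.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    words: str  # what was said
    tone: str   # how it was said (hypothetical stand-in for prosody, emotion)


def transcribe(audio: Audio) -> str:
    # Stage 1, speech-to-text: only the words survive; tone is dropped here.
    return audio.words


def respond(text: str) -> str:
    # Stage 2, text-only language model stand-in.
    return f"reply to '{text}'"


def synthesize(text: str) -> Audio:
    # Stage 3, text-to-speech: has no access to the caller's original tone.
    return Audio(words=text, tone="neutral")


def cascaded_agent(audio: Audio) -> Audio:
    # Three sequential stages; each one must finish before the next starts.
    return synthesize(respond(transcribe(audio)))


def speech_native_agent(audio: Audio) -> Audio:
    # Single-model stand-in: consumes audio directly, so tone stays available.
    return Audio(words=f"reply to '{audio.words}'", tone=audio.tone)


caller = Audio(words="my order is late", tone="frustrated")
print(cascaded_agent(caller).tone)
print(speech_native_agent(caller).tone)
```

In this sketch the cascade also makes three sequential calls where the speech-native path makes one, which is the intuition behind the latency claim. Real systems stream and overlap stages, so the actual difference depends on the implementation.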

Voice capabilities and use cases:

  • Transcription applications are freeing up time for professionals in various fields:
    • Abridge’s medical transcription tool generates notes and identifies follow-ups from clinical conversations.
    • Rillavoice records and transcribes conversations for sales training in the home services industry.
  • End-to-end voice agents are showing promise in multiple areas:
    • Inbound sales: Fielding customer calls after hours and booking appointments.
    • Customer support: Providing more effective responses than traditional IVR systems.
    • Outbound calls: Automating initial contact for sales and recruiting teams.

Vision capabilities and applications:

  • Models like GPT-4V and Gemini 1.5 Pro can interpret raw images and video and answer questions about what they show.
  • Key use cases include:
    • Data extraction from unstructured documents (e.g., Raft’s platform for freight forwarding).
    • Visual inspection augmentation (e.g., xBuild’s AI construction platform).
    • 2D and 3D design generation (e.g., Snaptrude’s 3D building design tool).
    • Video analytics for safety monitoring and object tracking.

The rise of AI agents:

  • Progress in constraining tasks for AI agents has led to reduced errors in multi-step reasoning.
  • Reasoning-focused foundation models like OpenAI’s o1 are showing promise in complex problem-solving.
  • Current applications include:
    • Sales and marketing: Researching prospects and crafting personalized outreach.
    • Negotiations: Automating legal and commercial term negotiations.
    • Investigations: Assisting with initial phases of cybersecurity alert investigations.

Broader implications for vertical AI:

  • Multimodal capabilities are expanding the potential impact of vertical AI across industries.
  • As underlying models become commoditized, it will be more sustainable for companies to build applications on top of powerful foundation models.
  • The integration of these new capabilities is expected to fundamentally change how we work and interact with the world.

Looking ahead: The next wave of vertical AI applications will likely focus on addressing complex workflows autonomously, leveraging advancements in reasoning-based models. As the technology continues to evolve, we can expect to see novel business models emerge, including copilots, agents, and AI-enabled services, opening up new opportunities in previously untapped industries.
