
Fugatto, NVIDIA’s generative AI sound model, marks a notable advance in audio technology: it creates and manipulates many types of sound from text and audio inputs within a single system.

The breakthrough innovation: NVIDIA has introduced Fugatto, a versatile generative AI model that can create and transform any combination of music, voices, and sounds using text prompts and audio file inputs.

  • The model’s name stands for Foundational Generative Audio Transformer Opus 1
  • Multi-platinum producer Ido Zmishlany describes the technology as “wild,” highlighting its potential to create entirely new sounds in real time
  • The system demonstrates emergent properties, allowing it to perform tasks beyond its initial training

Technical capabilities: Fugatto represents a significant leap forward in audio AI technology, combining multiple sophisticated features into a single system.

  • The full model has 2.5 billion parameters and was trained on NVIDIA DGX systems with 32 NVIDIA H100 Tensor Core GPUs
  • At inference, it uses a technique called ComposableART to combine instructions that were seen only separately during training
  • Users can control the degree of various effects through temporal interpolation, allowing for nuanced adjustments to accents, emotions, and sound transitions

Practical applications: The technology offers diverse use cases across multiple industries and creative fields.

  • Music producers can quickly prototype songs and experiment with different styles and instruments
  • Advertising agencies can adapt campaigns with various accents and emotional tones
  • Language learning tools can be personalized with familiar voices
  • Video game developers can modify audio assets in real time based on gameplay

Innovative features: The model breaks new ground in audio manipulation and generation capabilities.

  • Creates novel sound combinations, such as making a trumpet bark or a saxophone meow
  • Generates high-quality singing voices from text prompts with minimal training data
  • Produces dynamic soundscapes that evolve over time, such as transitioning from a thunderstorm to birdsong at dawn
  • Enables fine-grained control over sound attributes and transitions

Development process: The creation of Fugatto involved a diverse international team and sophisticated data preparation methods.

  • The project spanned over a year and included team members from India, Brazil, China, Jordan, and South Korea
  • Researchers developed a complex strategy for generating training data, including millions of audio samples
  • The team’s diverse backgrounds contributed to stronger multi-accent and multilingual capabilities

Looking ahead: Fugatto points to a shift in how we create and interact with sound, one that could lead to new forms of artistic expression and practical applications across industries. It also raises questions about the future relationship between AI and human creativity in audio production.
