The name is short for Foundational Generative Audio Transformer Opus 1. No one knows whether it is an acronym or a backronym, but that is beside the point.

Up to now, all we have is a paper, some sample clips, and a YouTube video, but boy does it look great.

Fugatto supports all of the following features out of the box:

  • Emergent properties
  • Large-scale data
  • Numerous tasks
  • Free-form instructions
  • Open-ended generation
  • Compositionality (ComposableART)
  • Multi-Modal Inputs

Other popular audio AIs tick one or two of these boxes; Fugatto is impressive in that it ticks all of them.

One feature that sets it apart from everything else is what NVIDIA calls the avocado chair: an object that traditionally makes one sound is made to produce the sound of something else. So when you say someone makes the guitar speak, this time you might actually mean it literally!

The main competitors in audio are:

AudioBox
NExT-GPT
UniAudio
Audit
VoiceLDM

Services

Suno AI

NVIDIA says it trained the model, which has 2.5 billion parameters, on 32 H100 Tensor Core GPUs (DGX systems).
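
To put the 2.5 billion parameter figure in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would take at a few common precisions. The precisions listed are assumptions for illustration; nothing published so far says which format Fugatto actually uses, and training needs considerably more memory than this (gradients, optimizer state, activations).

```python
# Rough estimate of weight storage for a 2.5-billion-parameter model.
PARAMS = 2_500_000_000  # parameter count quoted by NVIDIA

# Bytes per parameter for common numeric formats.
# Assumption: illustrative only, the precision Fugatto uses is not stated here.
BYTES_PER_PARAM = {"fp32": 4, "fp16/bf16": 2, "int8": 1}


def weight_gigabytes(params: int, bytes_per_param: int) -> float:
    """Return the size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for name, size in BYTES_PER_PARAM.items():
    print(f"{name}: {weight_gigabytes(PARAMS, size):.1f} GB")
```

In other words, the weights by themselves are small enough to fit on a single modern GPU at half precision; the 32 GPUs are about training throughput, not about merely holding the model.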

More about Fugatto

https://research.nvidia.com/publication/2024-11_fugatto-1-foundational-generative-audio-transformer-opus-1

https://blogs.nvidia.com/blog/fugatto-gen-ai-sound-model

