Facebook Parent Meta Introduces AudioCraft, an AI Tool That Creates Music and Audio from Text
Meta has launched AudioCraft, an open-source tool based on artificial intelligence (AI) that lets users create music from text descriptions alone.
AudioCraft includes three AI models: MusicGen, AudioGen, and EnCodec. MusicGen, which was trained on Meta-owned and specifically licensed music, generates music from text prompts.
AudioGen, which was trained on public sound effects, generates audio from text prompts.
According to the official Meta blog, as quoted on Saturday (August 5, 2023), EnCodec enables higher-quality music generation with fewer artifacts.
Meta also released its pre-trained AudioGen model, which lets users generate environmental sounds and sound effects such as a dog barking, a car horn, or footsteps on a wooden floor.
Meta also announced that it is sharing all the weights and code for the AudioCraft models.
"We created this open-source model, giving researchers and practitioners access so they can train their own models with their own datasets for the first time," wrote Meta.
"And helping advance the field of AI-generated audio and music," the Facebook parent company added.
According to Meta, audio is "slightly behind" images, video, and text when it comes to generative AI.
The company said that the existing work in this area is highly complicated and not very open, so people cannot readily experiment with it.
"Producing high-fidelity audio of any kind requires modeling complex signals and patterns at multiple scales," said Facebook's parent company.
Can Produce High Quality Audio
"Music is arguably the most challenging type of audio to produce as it is made up of local and distant patterns, from tone sets to global musical structures with multiple instruments," says Meta.
Meta claims the AudioCraft family of models can produce high-quality audio with long-term consistency and is easy to use.
"AudioCraft works for music and sound generation and compression, all in the same place," says Meta.
"Having a solid foundation of open source will drive innovation and complement the way we produce and listen to audio and music in the future," the company added.
Regarding copyright, Meta claims that the pre-trained models all use public or company-owned material.
AudioCraft is not the only text-to-audio AI tool. Google previously introduced its MusicLM model.
Google Develops MusicLM, Can Turn Text into Music
Google is developing MusicLM, an experimental AI application that lets users turn their ideas into music from text descriptions.
According to Android Police, as quoted on Thursday (June 1, 2023), the concept behind MusicLM is not much different from how Google's or Microsoft's chatbots work, responding to user prompts.
On Google's official blog, Kristin Yim, Product Manager of Labs, and Hema Manickavasagam, Product Manager of Google Research, announced that sign-ups for MusicLM have opened.
This app is said to be an experimental tool that can turn text commands into music. For example, when a user types “Soulful jazz for dinner,” the tool will generate music according to the description.
The blog also explains that MusicLM will produce two versions of the music, and users can choose the one they like more. This feedback helps the AI model improve its ability to produce music.
Google has also anticipated a concern raised about other AI tools, namely misuse that harms artists. The company says it has worked to develop the tool in line with "responsible innovation."
Cooperation with Musicians
To create a tool that fits that vision, Google says the company is working with a number of musicians. "We believe responsible innovation doesn't happen in isolation."
"We've worked with musicians like Dan Deacon and hosted workshops to see how this technology can empower the creative process," the company wrote on its official blog.
To support future research, Google has also publicly released MusicCaps, a dataset of about 5,000 music-text pairs with text descriptions written by experts.
According to Google, MusicLM can help users express their creativity, whether they are beginners or professional musicians.