
Introducing MoisesDB: The Ultimate Multitrack Dataset for Source Separation Beyond 4-Stems

Moises.ai has launched MoisesDB, a comprehensive, publicly available multitrack audio dataset. It includes 240 unreleased tracks from 47 artists and aims to enhance source separation capabilities.


Hello, music and AI enthusiasts! We are excited to announce the release of MoisesDB, the largest publicly available set of multitrack audio recordings for source separation beyond 4-stems. This comprehensive dataset comprises 240 previously unreleased songs by 47 artists, spanning twelve high-level genres. The total duration of the dataset is 14 hours, 24 minutes and 46 seconds; the average recording lasts 3 minutes and 36 seconds, with a standard deviation of 66 seconds. The dataset is organized according to a taxonomy that reflects the needs of source separation systems. The large number of songs, the diverse types of stems and tracks, and their organization in a source-separation-focused taxonomy allow researchers to build their own stems according to their own requirements, and thus to develop more granular source separation systems.
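To illustrate what "building your own stems" can look like in practice, here is a minimal sketch using only NumPy. It assumes each song is available as a dictionary of time-aligned track arrays keyed by a label. The labels, the `CUSTOM_STEMS` mapping, and the `build_stems` helper are hypothetical names chosen for this example; they are not part of the dataset's taxonomy or of the moises-db library.

```python
import numpy as np

# Hypothetical mapping from track labels to user-defined stems.
# These labels are illustrative and are not the official MoisesDB taxonomy.
CUSTOM_STEMS = {
    "rhythm": ["drums", "bass", "percussion"],
    "harmony": ["guitar", "piano"],
    "lead": ["vocals"],
}


def build_stems(tracks, stem_map):
    """Sum time-aligned tracks into custom stems.

    tracks:   dict mapping a label to an array of shape (channels, samples);
              all arrays are assumed to share the same shape.
    stem_map: dict mapping a stem name to the list of labels it contains.
    Labels that are missing from a song are skipped, and a stem with no
    matching tracks is left out of the result.
    """
    stems = {}
    for stem_name, labels in stem_map.items():
        present = [tracks[label] for label in labels if label in tracks]
        if present:
            stems[stem_name] = np.sum(present, axis=0)
    return stems
```

Because the stems are plain sums of the underlying tracks, a mapping that covers every track of a song yields stems that add up to the full mix, which is the property a separation model is usually trained against.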

To facilitate the use of this dataset, we provide an easy-to-use Python library, moises-db, that gives access to the data and integrates quickly with machine learning libraries. Moreover, we include performance results for two publicly available source separation methods: HT-Demucs, which has the best overall SDR score on the MUSDB18 test set, and Spleeter, which was one of the first source separation models released to and adopted by the general public. We also added results for a few masking-based oracle methods, namely the ideal binary mask (IBM), the ideal ratio mask (IRM), and the multichannel Wiener filter (MWF), which indicate the theoretical performance limits of mask-based source separation models.
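For readers unfamiliar with oracle masks, the sketch below shows the idea behind IBM and IRM on a toy two-source example. It is not the evaluation code behind the reported results: it works directly on random complex "spectrograms" rather than real audio, it measures SDR in the time-frequency domain, and it omits MWF. All function names are assumptions made for this illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def ideal_binary_mask(target, interference):
    """1 where the target is at least as loud as the interference, else 0."""
    return (np.abs(target) >= np.abs(interference)).astype(float)


def ideal_ratio_mask(target, interference):
    """Share of the magnitude in each time-frequency bin owned by the target."""
    t = np.abs(target)
    i = np.abs(interference)
    return t / (t + i + EPS)


def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB between a reference and an estimate."""
    signal = np.sum(np.abs(reference) ** 2)
    distortion = np.sum(np.abs(reference - estimate) ** 2)
    return 10.0 * np.log10((signal + EPS) / (distortion + EPS))


def toy_spectrogram(rng, shape):
    """Random complex array standing in for the STFT of a source."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vocals = toy_spectrogram(rng, (16, 32))
    backing = toy_spectrogram(rng, (16, 32))
    mix = vocals + backing

    # Oracle masks are computed from the ground-truth sources, which a real
    # model never sees; that is why they act as an upper-bound reference.
    for name, mask in [
        ("IBM", ideal_binary_mask(vocals, backing)),
        ("IRM", ideal_ratio_mask(vocals, backing)),
    ]:
        print(name, "SDR (dB):", sdr_db(vocals, mask * mix))
    print("Unprocessed mix SDR (dB):", sdr_db(vocals, mix))
```

The masks are applied to the mixture, not to the isolated source, so the resulting score reflects what a perfect mask predictor could achieve at best.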

MoisesDB is offered free of charge for non-commercial research use only. We hope that this dataset will prove to be a great resource for the source separation community. It aims to facilitate the development of better and more extensive source separation models, and to open up other use cases such as automatic mixing and generative accompaniment systems.

For more information about the dataset, please refer to our publication at ISMIR 2023.

@misc{pereira2023moisesdb,
  title={Moisesdb: A dataset for source separation beyond 4-stems}, 
  author={Igor Pereira and Felipe Araújo and Filip Korzeniowski and Richard Vogl},
  year={2023},
  eprint={2307.15913},
  archivePrefix={arXiv},
  primaryClass={cs.SD}
}

Our Commitment to the MIR Community

At Moises.ai, we are committed to fostering innovation and progress in the MIR community. By providing researchers with the right tools and resources, we can help accelerate the development of better solutions and technologies in music information retrieval. MoisesDB is a testament to this commitment.

Download the Dataset

Future Plans for MoisesDB

The launch of MoisesDB is just the beginning. We plan to continuously expand and enrich this dataset with more source-separated tracks, covering a wider range of genres and styles. Our goal is to make MoisesDB the most comprehensive and diverse dataset for MIR research, eventually replacing MUSDB as the de facto research and benchmarking dataset for source separation.

We invite researchers around the globe to explore MoisesDB, leverage its potential, and contribute to the advancement of MIR research. Together, we can push the boundaries of what's possible in music information retrieval and create better solutions for the future.

To download the MoisesDB dataset, please visit the download area on our website. We look forward to seeing the innovative ways in which this dataset will be used to advance the field of source separation.

Stay tuned for more updates on MoisesDB and our ongoing efforts to support and advance MIR research. We are just getting started, and we are excited about the journey ahead.

Let's make music research better, together!