
External Open-Music Ingestion

Goal

Convert local open-music audio folders into ACR-ready manifests for:

  • training queries
  • evaluation queries
  • reference catalog indexing

Recommended personal-use flow

1. Prepare a local audio directory

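Examples (the directories used in the commands below):

  • data/raw/fma_small_audio
  • data/raw/mtg_jamendo_audio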

2. Generate manifests through the adapter entrypoint

Optional pre-check:

/usr/local/miniconda3/bin/python src/data/external_adapters.py inspect-local fma data/raw/fma_small_audio --eval-ratio 0.2 --query-duration 8.0

Batch pre-check across multiple candidate corpora:

/usr/local/miniconda3/bin/python src/data/external_adapters.py inspect-batch fma=data/raw/fma_small_audio mtg_jamendo=data/raw/mtg_jamendo_audio --eval-ratio 0.2 --query-duration 8.0
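To get a feel for a corpus before running the pre-check commands above, a rough stand-alone count of audio files can help. The snippet below is only a sketch: `count_audio_files` and the `AUDIO_EXTENSIONS` set are hypothetical names chosen for illustration, and nothing here is taken from `external_adapters.py`, which may check different things.

```python
from pathlib import Path

# Assumed set of audio extensions; the real adapter may accept others.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg"}


def count_audio_files(root: str) -> dict[str, int]:
    """Count audio files under root (recursively), grouped by lowercase extension."""
    counts: dict[str, int] = {}
    for path in Path(root).rglob("*"):
        ext = path.suffix.lower()
        if path.is_file() and ext in AUDIO_EXTENSIONS:
            counts[ext] = counts.get(ext, 0) + 1
    return counts


if __name__ == "__main__":
    # Directory name taken from the commands in this README.
    print(count_audio_files("data/raw/fma_small_audio"))
```

A directory that yields only a handful of files is a hint that the evaluation split will be very small at `--eval-ratio 0.2`.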

Then generate manifests:

/usr/local/miniconda3/bin/python src/data/external_adapters.py prepare-local fma data/raw/fma_small_audio --output-root data/external_ingested --eval-ratio 0.2 --query-duration 8.0

or

/usr/local/miniconda3/bin/python src/data/external_adapters.py prepare-local mtg_jamendo data/raw/mtg_jamendo_audio --output-root data/external_ingested --eval-ratio 0.2 --query-duration 8.0
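The `--eval-ratio 0.2` flag in the commands above suggests a train/test split over the tracks, and the Notes mention that small datasets are protected so both sets exist. One way such a split could work is sketched below. This is an illustration under assumptions, not the adapter's actual logic: `split_tracks`, its `seed` parameter, and the clamping rule are hypothetical.

```python
import random


def split_tracks(tracks: list[str], eval_ratio: float = 0.2, seed: int = 0):
    """Split tracks into (train, test), keeping both sides non-empty when possible."""
    # Sort first so the result does not depend on the input order.
    ordered = sorted(tracks)
    random.Random(seed).shuffle(ordered)

    n = len(ordered)
    n_test = round(n * eval_ratio)
    if n >= 2:
        # Clamp so neither side ends up empty (assumed "small-dataset protection").
        n_test = min(max(n_test, 1), n - 1)
    else:
        # With zero or one track, a two-way split is not possible.
        n_test = 0
    return ordered[n_test:], ordered[:n_test]


if __name__ == "__main__":
    train, test = split_tracks([f"track_{i:03d}.mp3" for i in range(10)])
    print(len(train), len(test))
```

Using a fixed seed and sorting the input makes the split reproducible, which matters when the test set is meant to stay the same across experiments.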

3. Use outputs

Generated files:

Notes

  • For small datasets, the split is adjusted automatically so that both the train and test query sets exist.
  • For personal use, FMA and MTG-Jamendo are the recommended first real-data baselines.
  • Keep test.json fixed across experiments to compare models fairly.
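One simple way to confirm that test.json has stayed fixed between experiments is to record a checksum once and compare it before each run. The helper names below (`file_sha256`, `assert_unchanged`) are hypothetical and not part of this repository; the location of test.json under the output root is also left to the reader, since the generated file layout is not listed here.

```python
import hashlib
from pathlib import Path


def file_sha256(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def assert_unchanged(path: str, expected_digest: str) -> None:
    """Raise ValueError if the file no longer matches the recorded digest."""
    actual = file_sha256(path)
    if actual != expected_digest:
        raise ValueError(f"{path} changed: expected {expected_digest}, got {actual}")
```

A typical use would be to call `file_sha256` once after generating the manifests, store the digest alongside the experiment notes, and call `assert_unchanged` at the start of every evaluation run.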