README.bootstrap.md
142 Bytes
modelscope_music bootstrap
- Fill raw audio files under
raw/ - Review license before training
- Convert to final catalog/query manifests
Broaden external dataset bootstrap support and replace naive hard-case oversampling with a more targeted weighting signal that measurably helps humming-like queries while preserving the release/eval workflow. Constraint: Hard-case optimization must be evidence-driven and preserve a record of mixed outcomes across iterations Rejected: Reuse naive oversampling after regression | it already showed worse overall behavior with no hard-case gain Confidence: medium Scope-risk: moderate Directive: Next iteration should target confused-case negatives explicitly; do not assume humming gains transfer to confusion robustness Tested: bootstrap generation for MTG-Jamendo and ModelScope placeholders; 2-epoch CPU training for models_v5; index_v5 build; fast eval JSON generation for smoke-v5 Not-tested: real audio ingestion for the new datasets; full melody-aware slow evaluation on models_v5cnb.bofCdSsphPA authored