Add external dataset bootstrap and record hard-case oversampling regression
Extend the data ingress path with bootstrap manifests for real datasets and capture an unsuccessful hard-case oversampling experiment so future iterations can avoid repeating the same weak strategy.

Constraint: Continuous optimization requires preserving negative results, not just successful ones
Rejected: Drop the oversampling attempt without record | would lose evidence and encourage redoing the same low-yield change
Confidence: high
Scope-risk: moderate
Directive: Next hard-case work should focus on melody-aware supervision and harder negatives instead of naive sample repetition
Tested: bootstrap manifest generation for FMA and CCMusic; 2-epoch CPU training for models_v4; index_v4 build; fast eval JSON generation for smoke-v4
Not-tested: whitelisted real audio ingestion beyond placeholder manifests; full melody-aware slow-eval on models_v4
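
For readers unfamiliar with the rejected approach, "naive sample repetition" can be sketched roughly as below. This is an illustration only, not the code from this commit: the function name `oversample_hard_cases`, the `repeat` parameter, and the `{"id": ...}` sample layout are all assumptions made for the example.

```python
import random
from collections import Counter


def oversample_hard_cases(samples, hard_ids, repeat=3, seed=0):
    """Naively repeat samples flagged as hard (hypothetical helper).

    samples:  list of dicts with an "id" key (assumed layout)
    hard_ids: set of ids considered hard cases
    repeat:   how many copies of each hard sample to emit
    """
    expanded = []
    for sample in samples:
        # Hard cases are duplicated verbatim; nothing new is learned about them.
        copies = repeat if sample["id"] in hard_ids else 1
        expanded.extend([sample] * copies)
    # Shuffle with a fixed seed so the epoch order is reproducible.
    random.Random(seed).shuffle(expanded)
    return expanded


# Usage: 5 tracks, 2 of them flagged as hard.
samples = [{"id": f"track_{i}"} for i in range(5)]
hard_ids = {"track_1", "track_3"}
epoch = oversample_hard_cases(samples, hard_ids, repeat=3)
counts = Counter(s["id"] for s in epoch)
```

Because the repeated items are exact copies, the model sees no additional signal about why they are hard, which is in line with the directive above to prefer melody-aware supervision and harder negatives.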
Showing 22 changed files with 248 additions and 2 deletions
acr-engine/data/index_v4/chromaprint.pkl
0 → 100644
No preview for this file type
acr-engine/data/index_v4/reference_embs.npy
0 → 100644
No preview for this file type
acr-engine/data/index_v4/reference_ids.npy
0 → 100644
No preview for this file type
acr-engine/data/models_v4/best_model.pt
0 → 100644
This file is too large to display.
acr-engine/data/models_v4/song_to_idx.json
0 → 100644
No preview for this file type
acr-engine/src/data/bootstrap_external.py
0 → 100755
No preview for this file type