Improve confused-case retrieval with sample-level hard weighting
Constraint: Must preserve runnable pipeline and record stage evidence before continuing optimization
Rejected: More naive oversampling | Regressed overall and hard-case accuracy in smoke-v4
Confidence: medium
Scope-risk: moderate
Directive: Treat confused and humming_like as separate optimization lanes in future stages
Tested: /usr/local/miniconda3/bin/python train.py --data data/synthetic_v2 --output data/models_v6 --device cpu --epochs 1 --batch-size 6 --dry-run;
  /usr/local/miniconda3/bin/python -m py_compile train.py src/models/losses.py src/data/dataset.py;
  /usr/local/miniconda3/bin/python train.py --data data/synthetic_v2 --output data/models_v6 --device cpu --epochs 2 --batch-size 6;
  /usr/local/miniconda3/bin/python run_demo.py build-index --data data/synthetic_v2 --model data/models_v6/best_model.pt --output data/index_v6 --device cpu;
  /usr/local/miniconda3/bin/python evaluate.py --data data/synthetic_v2 --model data/models_v6/best_model.pt --index-prefix data/index_v6/reference --split test --device cpu --fast-eval --output-json reports/smoke-v6/synthetic_v2/eval.json;
  /usr/local/miniconda3/bin/python scripts/generate_artifacts.py --eval-json reports/smoke-v6/synthetic_v2/eval.json --config-json reports/smoke-v6/synthetic_v2/config.json --output-dir reports/smoke-v6/synthetic_v2 --model-version smoke-v6 --data-version synthetic_v2
Not-tested: Real external dataset training run and GPU-scale convergence
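Illustration: the commit message does not show how the weighting in src/models/losses.py is implemented, so the following is only a minimal sketch of what sample-level hard weighting can look like, as opposed to oversampling. Every name here (HARD_WEIGHTS, softmax_cross_entropy, weighted_batch_loss), the tag "clean", and the weight values are hypothetical and not taken from the repository. Each sample stays in the batch exactly once; its loss is scaled by a weight looked up from its tag, and the batch loss is the weighted mean. Keeping confused and humming_like as separate keys is one possible way to let the two lanes be tuned independently.

```python
import math

# Hypothetical per-tag weights; the commit does not state the actual values.
HARD_WEIGHTS = {"clean": 1.0, "confused": 2.0, "humming_like": 1.0}


def softmax_cross_entropy(logits, target):
    """Cross-entropy of one sample from raw logits (log-sum-exp for stability)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def weighted_batch_loss(batch_logits, targets, tags, weights=HARD_WEIGHTS):
    """Weighted mean of per-sample losses; unknown tags fall back to weight 1.0.

    Assumes a non-empty batch.
    """
    sample_weights = [weights.get(tag, 1.0) for tag in tags]
    losses = [softmax_cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    total = sum(w * loss for w, loss in zip(sample_weights, losses))
    return total / sum(sample_weights)
```

With these assumed weights, a misclassified sample tagged confused contributes twice as much to the batch loss as a clean one, without duplicating it in the dataset.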
Showing 17 changed files with 321 additions and 7 deletions
acr-engine/data/index_v6/chromaprint.pkl
0 → 100644
No preview for this file type
acr-engine/data/index_v6/reference_embs.npy
0 → 100644
No preview for this file type
acr-engine/data/index_v6/reference_ids.npy
0 → 100644
No preview for this file type
acr-engine/data/models_v6/best_model.pt
0 → 100644
This file is too large to display.
acr-engine/data/models_v6/song_to_idx.json
0 → 100644