acr-engine/data/index_v6/reference_ids.npy · 65cc45c2b3911f019c3ee4a6421a5e9014853005 · wanghai-tech / hikoon-ACR · GitLab

Improve confused-case retrieval with sample-level hard weighting · c89ef4f9 ...

Constraint: Must preserve runnable pipeline and record stage evidence before continuing optimization
Rejected: More naive oversampling | Regressed overall and hard-case accuracy in smoke-v4
Confidence: medium
Scope-risk: moderate
Directive: Treat confused and humming_like as separate optimization lanes in future stages
Tested: /usr/local/miniconda3/bin/python train.py --data data/synthetic_v2 --output data/models_v6 --device cpu --epochs 1 --batch-size 6 --dry-run;
  /usr/local/miniconda3/bin/python -m py_compile train.py src/models/losses.py src/data/dataset.py;
  /usr/local/miniconda3/bin/python train.py --data data/synthetic_v2 --output data/models_v6 --device cpu --epochs 2 --batch-size 6;
  /usr/local/miniconda3/bin/python run_demo.py build-index --data data/synthetic_v2 --model data/models_v6/best_model.pt --output data/index_v6 --device cpu;
  /usr/local/miniconda3/bin/python evaluate.py --data data/synthetic_v2 --model data/models_v6/best_model.pt --index-prefix data/index_v6/reference --split test --device cpu --fast-eval --output-json reports/smoke-v6/synthetic_v2/eval.json;
  /usr/local/miniconda3/bin/python scripts/generate_artifacts.py --eval-json reports/smoke-v6/synthetic_v2/eval.json --config-json reports/smoke-v6/synthetic_v2/config.json --output-dir reports/smoke-v6/synthetic_v2 --model-version smoke-v6 --data-version synthetic_v2
Not-tested: Real external dataset training run and GPU-scale convergence
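
The commit title refers to sample-level hard weighting, and the trailers contrast it with the rejected oversampling approach. The contents of src/models/losses.py are not shown on this page, so the following is only a minimal sketch of the general idea, not the repository's implementation. The function name `weighted_mean_loss`, the `HARD_WEIGHTS` table, and the weight values are hypothetical; only the tag names `confused` and `humming_like` come from the commit message.

```python
from typing import Sequence

# Hypothetical per-tag weights; the values used in the commit are not shown.
HARD_WEIGHTS = {"confused": 2.0, "humming_like": 1.5}


def weighted_mean_loss(losses: Sequence[float], tags: Sequence[str]) -> float:
    """Return the weighted mean of per-sample losses.

    Samples whose tag appears in HARD_WEIGHTS contribute more to the
    batch loss; all other samples keep a weight of 1.0.
    """
    if len(losses) != len(tags):
        raise ValueError("losses and tags must have the same length")
    if not losses:
        raise ValueError("batch must not be empty")
    weights = [HARD_WEIGHTS.get(tag, 1.0) for tag in tags]
    total = sum(w * loss for w, loss in zip(weights, losses))
    # Normalise by the weight sum so the loss scale stays comparable
    # across batches with different numbers of hard samples.
    return total / sum(weights)
```

Unlike oversampling, this keeps each sample in the batch exactly once and changes only how much it contributes to the loss. Keeping separate entries for `confused` and `humming_like` would also allow the two lanes to be tuned independently, as the Directive trailer suggests.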

authored 2026-06-02 12:20:42 +0800

reference_ids.npy 4.34 KB
