Commit 026b5539 026b553984497e893a1578370b29a3b4f4bd7f8d by cnb.bofCdSsphPA

Checkpoint the cap48 benchmark while the larger run is still building

Preserve the new 48-track top-two benchmark entry point and current build-index phase so later sessions can continue the expanding validation ladder without rediscovering runtime state.

Constraint: cap48 has not produced scores yet, so only execution-state evidence is available
Rejected: Wait for cap48 scores before recording anything | Risks losing the larger-benchmark checkpoint if the session ends first
Confidence: high
Scope-risk: narrow
Directive: Replace the cap48 running-state section with measured scores once hybrid eval.json or report.json land
Tested: Verified active cap48 processes; verified handoff records work-root, subset size, query cap, and current build-index phase
Not-tested: cap48 strategy scores because the run is still in progress
1 parent f05e7023
...@@ -2,6 +2,28 @@ ...@@ -2,6 +2,28 @@
2 2
3 ## 2026-06-02 3 ## 2026-06-02
4 4
5 ### Stage: Start the cap48 top2 real-FMA comparison and record the run phase
6
7 Completed:
8 - Started a larger real-FMA top2 benchmark:
9 - `work_root = /tmp/ab_smoke_seg_cap48_top2`
10 - `subset_size = 48`
11 - `max_test_queries = 24`
12 - Strategies: `hybrid` vs `high_energy`
13 - Updated [session-handoff.md](./session-handoff.md)
14
15 Current fresh evidence:
16 - `scripts/ab_smoke_segmentation.py ... --work-root /tmp/ab_smoke_seg_cap48_top2` has been started
17 - The current first lane is:
18 - `hybrid`
19 - The run has now entered:
20 - `run_demo.py build-index --resume --checkpoint-every-refs 100`
21 - `report.json` has not been written to disk yet
22
23 Conclusions:
24 - Validation has begun on whether the cap24 / cap32 conclusions still hold on the larger `subset=48`
25 - Even if the current session ends, a new session can resume watching for results directly from the cap48 entry point in the handoff
26
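5 ### Stage: Wrap up the cap32 top2 real-FMA comparison and settle the default-strategy conclusion 27 ### Stage: Wrap up the cap32 top2 real-FMA comparison and settle the default-strategy conclusion
6 28
7 Completed: 29 Completed: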
......
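...@@ -456,6 +456,48 @@ cap32 top2 final conclusion: ...@@ -456,6 +456,48 @@ cap32 top2 final conclusion:
456 - `hybrid`: `20 / 0.95 / 1.0` 456 - `hybrid`: `20 / 0.95 / 1.0`
457 - `high_energy`: `20 / 0.5 / 1.0` 457 - `high_energy`: `20 / 0.5 / 1.0`
458 - Both larger real-subset rounds, cap24 and cap32, point to the same conclusion: **the default strategy is fixed to `hybrid`** 458 - Both larger real-subset rounds, cap24 and cap32, point to the same conclusion: **the default strategy is fixed to `hybrid`**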
459
460 ---
461
462 ## 12. cap48 top2 comparison experiment (in progress)
463
464 To keep extending the real-data evidence chain, a larger FMA top2 comparison has been started:
465
466 ```bash
467 cd /workspace/acr-engine
468 /usr/local/miniconda3/bin/python scripts/ab_smoke_segmentation.py \
469 --dataset fma \
470 --input-dir data/raw/fma_small_audio \
471 --work-root /tmp/ab_smoke_seg_cap48_top2 \
472 --subset-size 48 \
473 --query-duration 8 \
474 --train-epochs 1 \
475 --batch-size 2 \
476 --device cpu \
477 --strategies hybrid high_energy \
478 --max-test-queries 24 \
479 --output-json /tmp/ab_smoke_seg_cap48_top2/report.json
480 ```
481
482 Current fresh evidence:
483
484 | Item | Status |
485 |---|---|
486 | `subset_size` | `48` |
487 | `max_test_queries` | `24` |
488 | First strategy to run | `hybrid` |
489 | Current phase | `run_demo.py build-index --resume --checkpoint-every-refs 100` |
490 | `report.json` | Not yet generated |
491
492 Recovery check command:
493
494 ```bash
495 pgrep -af 'ab_smoke_seg_cap48_top2|external_adapters.py smoke-local fma /tmp/ab_smoke_seg_cap48_top2|evaluate.py --data /tmp/ab_smoke_seg_cap48_top2|run_demo.py build-index --data /tmp/ab_smoke_seg_cap48_top2|train.py --data /tmp/ab_smoke_seg_cap48_top2'
496 ```
497
498 Files to wait for first:
499 - `/tmp/ab_smoke_seg_cap48_top2/hybrid/fma_reports_smoke/eval.json`
500 - `/tmp/ab_smoke_seg_cap48_top2/report.json`
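
A later session could check for the awaited files with something like the sketch below. It is only an illustration: `missing_outputs` is a hypothetical helper that does not exist in the repository, and the only values taken from the notes are the two file paths. Treating an empty file as "not ready" is an assumption made here, not documented behaviour of the benchmark scripts.

```python
from pathlib import Path

# Paths copied from the handoff notes; everything else here is hypothetical.
AWAITED = [
    "/tmp/ab_smoke_seg_cap48_top2/hybrid/fma_reports_smoke/eval.json",
    "/tmp/ab_smoke_seg_cap48_top2/report.json",
]


def missing_outputs(paths=AWAITED):
    """Return the awaited files that do not exist yet or are still empty."""
    missing = []
    for p in paths:
        path = Path(p)
        # Assumption: a zero-byte file means the writer has not finished.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(p)
    return missing


if __name__ == "__main__":
    pending = missing_outputs()
    if pending:
        print("still waiting for:")
        for p in pending:
            print("  " + p)
    else:
        print("all awaited outputs are present")
```

If the list comes back empty, both outputs are on disk and the running-state section can be replaced with measured scores.
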
459 - `b766c74` Make open-dataset manifests trainable end to end 501 - `b766c74` Make open-dataset manifests trainable end to end
460 - `fa23144` Add a single-page open dataset workflow for training prep 502 - `fa23144` Add a single-page open dataset workflow for training prep
461 - `af33be3` Condense docs and add manifest validation before training 503 - `af33be3` Condense docs and add manifest validation before training
......