Commit a3a5303f a3a5303fed5abd701115708b678a59c00ef599b9 by cnb.bofCdSsphPA

Record the first cap48 seed123 hybrid score for the multi-seed check

Persist the newly finished cap48 seed123 hybrid result so the second-seed validation run now has measured evidence instead of only a runtime checkpoint.

Constraint: seed123 high_energy and the final report are still pending
Rejected: Wait for the full seed123 report before updating docs | Would leave the multi-seed evidence stale across sessions
Confidence: high
Scope-risk: narrow
Directive: Replace the seed123 partial section with the final two-strategy ranking once high_energy eval and report.json land
Tested: Verified /tmp/ab_smoke_seg_cap48_top2_seed123/hybrid/fma_reports_smoke/eval.json; verified docs record hybrid num_queries=24, top1=0.9583, topk=1.0 and high_energy still in build-index
Not-tested: Final seed123 comparison because high_energy has not finished yet
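
The manual check in the `Tested:` trailer could be repeated with a small script along these lines. This is a minimal sketch, not part of the repository: the function name `check_eval` is hypothetical, and it assumes `eval.json` is a flat JSON object with the keys `num_queries`, `top1`, and `topk` as named in the docs. The tolerance reflects that `top1 = 0.9583` is a four-decimal rounding (23 correct out of 24 queries gives 0.958333...).

```python
import json
import math
from pathlib import Path

# Values the docs record for the seed123 hybrid lane.
EXPECTED = {"num_queries": 24, "top1": 0.9583, "topk": 1.0}


def check_eval(path, expected=EXPECTED, tol=5e-5):
    """Compare an eval.json file against the documented values.

    Assumes (hypothetically) a flat JSON object holding numeric
    num_queries / top1 / topk fields. Returns a list of mismatch
    messages; an empty list means the file matches within `tol`.
    """
    data = json.loads(Path(path).read_text())
    problems = []
    for key, want in expected.items():
        got = data.get(key)
        if got is None:
            problems.append(f"{key}: missing")
        elif not math.isclose(got, want, abs_tol=tol):
            problems.append(f"{key}: got {got}, expected {want}")
    return problems
```

Calling `check_eval("/tmp/ab_smoke_seg_cap48_top2_seed123/hybrid/fma_reports_smoke/eval.json")` would return an empty list if the file still matches the documented numbers, and a list of differences otherwise.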
1 parent ef7e4493
......@@ -15,10 +15,12 @@
Current fresh evidence:
- The second seed has been started
- The current first lane is:
- `hybrid`
- It has now reached:
- `evaluate.py --data /tmp/ab_smoke_seg_cap48_top2_seed123/hybrid/fma/manifests ... --max-queries 24`
- `hybrid` has completed its first evaluation:
- `num_queries = 24`
- `top1 = 0.9583`
- `topk = 1.0`
- `high_energy` has now reached:
- `run_demo.py build-index --resume --checkpoint-every-refs 100`
Conclusion:
- Status has moved from "single-run cap48 reversal" to "multi-seed verification started"
......
......@@ -534,8 +534,8 @@ cd /workspace/acr-engine
| `subset_size` | `48` |
| `max_test_queries` | `24` |
| `seed` | `123` |
| First strategy run | `hybrid` |
| Current stage | `evaluate.py --max-queries 24` |
| `hybrid` | `num_queries=24`, `top1=0.9583`, `topk=1.0` |
| `high_energy` | `run_demo.py build-index --resume --checkpoint-every-refs 100` |
| `report.json` | Not yet generated |
Resume check commands:
......