Preserve proof that cap64 advanced into evaluation before results landed

Constraint: The cap64 run is still active, so this checkpoint can only record stage completion evidence rather than final benchmark conclusions
Rejected: Wait for eval.json or report.json before committing | Would lose the verified handoff that indexing finished and evaluate.py is now running
Confidence: high
Scope-risk: narrow
Directive: Keep stage checkpoints explicit (training complete, index complete, evaluation running, report complete) until cap64 fully settles
Tested: Verified reference_progress.json shows 64 refs, 657 windows, and complete status; verified active process is evaluate.py on /tmp/ab_smoke_seg_cap64_top2/high_energy/fma/manifests; verified high_energy eval.json and report.json are still absent
Not-tested: Final cap64 high_energy metrics, hybrid branch execution, and post-cap64 strategy guidance
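
The `Tested:` check on `reference_progress.json` could be scripted roughly as below. This is a minimal sketch, not the project's own tooling: the field names (`status`, `refs_done`, `windows_done`, `embedding_shape`) are taken from the evidence recorded in this commit, while the flat JSON layout, the helper names `index_is_complete` and `load_progress`, and the rule that the embedding matrix has one row per window are assumptions.

```python
import json
from pathlib import Path


def index_is_complete(progress: dict, expected_refs: int) -> bool:
    """Return True when a progress record describes a finished index build."""
    shape = progress.get("embedding_shape") or []
    return (
        progress.get("status") == "complete"
        and progress.get("refs_done") == expected_refs
        # Assumption of this sketch: one embedding row per indexed window.
        and len(shape) == 2
        and shape[0] == progress.get("windows_done")
    )


def load_progress(path: Path) -> dict:
    """Read a progress file; a missing file yields an empty record."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))
```

With the values recorded here (`refs_done=64`, `windows_done=657`, `embedding_shape=[657, 192]`), `index_is_complete(record, expected_refs=64)` would return `True`; a record still in progress, or a missing file, would return `False`.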
Showing 4 changed files with 23 additions and 3 deletions
| 1 | ## 2026-06-02 cap64 index complete, evaluation started checkpoint | ||
| 2 | |||
| 3 | Completed: | ||
| 4 | - Confirmed that the cap64 `high_energy` reference index build is complete. | ||
| 5 | - Confirmed that the pipeline advanced from `build-index` to `evaluate.py`. | ||
| 6 | |||
| 7 | Verification evidence: | ||
| 8 | - `reference_progress.json`: | ||
| 9 | - `status=complete` | ||
| 10 | - `refs_done=64` | ||
| 11 | - `windows_done=657` | ||
| 12 | - `embedding_shape=[657, 192]` | ||
| 13 | - `elapsed_sec=108.986` | ||
| 14 | - Process tree shows: | ||
| 15 | - `evaluate.py --data /tmp/ab_smoke_seg_cap64_top2/high_energy/fma/manifests ... --seed 42 --max-queries 32` | ||
| 16 | - As of this checkpoint: | ||
| 17 | - `report.json` has not been generated yet | ||
| 18 | |||
| 1 | ## 2026-06-02 cap64 training-complete evidence checkpoint | 19 | ## 2026-06-02 cap64 training-complete evidence checkpoint |
| 2 | 20 | ||
| 3 | Completed: | 21 | Completed: | ... | ... |
| ... | @@ -66,3 +66,5 @@ cd /workspace/acr-engine | ... | @@ -66,3 +66,5 @@ cd /workspace/acr-engine |
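| 66 | - Filled in the final `seed=999` results and completed the cap48 three-seed aggregate summary. | 66 | - Filled in the final `seed=999` results and completed the cap48 three-seed aggregate summary. |
| 67 | 67 | ||
| 68 | - Recorded that the cap64 benchmark has started and confirmed it entered the `high_energy` training stage. | 68 | - Recorded that the cap64 benchmark has started and confirmed it entered the `high_energy` training stage. |
| 69 | |||
| 70 | - Added fresh cap64 evidence: the `high_energy` index is complete (`64 refs / 657 windows / 192-d`) and the run has entered `evaluate.py`. | ... | ... |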
| ... | @@ -60,5 +60,5 @@ test -f /tmp/ab_smoke_seg_cap48_top2_seed999/report.json && cat /tmp/ab_smoke_se | ... | @@ -60,5 +60,5 @@ test -f /tmp/ab_smoke_seg_cap48_top2_seed999/report.json && cat /tmp/ab_smoke_se |
| 60 | ## Next round started | 60 | ## Next round started |
| 61 | 61 | ||
| 62 | - New benchmark: `/tmp/ab_smoke_seg_cap64_top2` | 62 | - New benchmark: `/tmp/ab_smoke_seg_cap64_top2` |
| 63 | - Current stage: `high_energy` training complete, now in build-index | 63 | - Current stage: `high_energy` index complete, now in evaluate |
| 64 | - The next session should first check whether `report.json` has been generated | 64 | - The next session should first check whether `report.json` has been generated | ... | ... |
| ... | @@ -675,6 +675,6 @@ seed123 final conclusion: | ... | @@ -675,6 +675,6 @@ seed123 final conclusion: |
| 675 | - Started: `/tmp/ab_smoke_seg_cap64_top2` | 675 | - Started: `/tmp/ab_smoke_seg_cap64_top2` |
| 676 | - Config: `subset_size=64`, `max_test_queries=32`, `seed=42` | 676 | - Config: `subset_size=64`, `max_test_queries=32`, `seed=42` |
| 677 | - Latest evidence: | 677 | - Latest evidence: |
| 678 | - Confirmed from the running session that `high_energy` `Epoch 1/1` ran to completion (`32/32`) | 678 | - `high_energy` reference index complete: `64 refs / 657 windows / 192-d` |
| 679 | - Currently in `run_demo.py build-index`; the final `report.json` has not been produced yet | 679 | - Now in `evaluate.py`; the final `report.json` has not been produced yet |
| 680 | 680 | ... | ... |
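
The four stage checkpoints named in the commit directive (training complete, index complete, evaluation running, report complete) can be modelled as an ordered enum, so that each note records only the furthest stage the evidence supports. The sketch below is illustrative and is not part of the changed files: the `Stage` enum and `latest_stage` helper are hypothetical names, and how each boolean is obtained (session log, `reference_progress.json`, process tree, presence of `report.json`) is left to the caller.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    TRAINING_COMPLETE = "training complete"
    INDEX_COMPLETE = "index complete"
    EVALUATION_RUNNING = "evaluation running"
    REPORT_COMPLETE = "report complete"


def latest_stage(
    training_done: bool,
    index_done: bool,
    evaluate_running: bool,
    report_exists: bool,
) -> Optional[Stage]:
    """Return the furthest stage supported by the evidence, or None."""
    # Check from the last stage backwards so later evidence wins.
    if report_exists:
        return Stage.REPORT_COMPLETE
    if evaluate_running:
        return Stage.EVALUATION_RUNNING
    if index_done:
        return Stage.INDEX_COMPLETE
    if training_done:
        return Stage.TRAINING_COMPLETE
    return None
```

For the state recorded in this commit (training done, index done, `evaluate.py` active, no `report.json`), `latest_stage(True, True, True, False)` yields `Stage.EVALUATION_RUNNING`.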