Record the proven offline smoke so the handoff reflects executable evidence
Constraint: Limit this checkpoint to documentation updates backed by already-collected local evidence
Rejected: Leave the smoke result only in transient chat output | The next session needs the proof captured in repo-native handoff files
Confidence: high
Scope-risk: narrow
Directive: Keep treating the offline smoke as an integration proof, not as a substitute for real business-data validation
Tested: Rechecked 183 relative links and documented the successful offline smoke summary already verified locally
Not-tested: No new code path executed in this documentation-only checkpoint
Showing 3 changed files with 38 additions and 0 deletions
| 1 | ## 2026-06-02 Checkpoint: business-export offline smoke run passed | ||
| 2 | |||
| 3 | Completed: | ||
| 4 | - Actually ran `acr-engine/scripts/business_export_offline_smoke.py`. | ||
| 5 | - Confirmed that the whole chain runs end to end: business export sample -> manifest-ready JSONL -> project manifest -> `train.py --dry-run`. | ||
| 6 | |||
| 7 | Validation results: | ||
| 8 | - `input_rows=5` | ||
| 9 | - `output_rows=5` | ||
| 10 | - roles=`reference/query/excluded` | ||
| 11 | - buckets=`lossless_reference_core/short_video_hook/demo_variation_pool` | ||
| 12 | - `catalog_refs=2` | ||
| 13 | - `train_queries=1` | ||
| 14 | - `test_queries=1` | ||
| 15 | - `val_queries=0` | ||
| 16 | - `dry_run_passed=true` | ||
| 17 | |||
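| 18 | Conclusion: | ||
| 19 | - The business-export offline adaptation chain now has real, runnable evidence, rather than being only a collection of templates and scripts. | ||
| 20 | - The next session can swap in real business export data directly and continue along the same chain. | ||
| 21 | |||
| 1 | ## 2026-06-02 Checkpoint: project manifest adapter script delivered | 22 | ## 2026-06-02 Checkpoint: project manifest adapter script delivered |
| 2 | 23 | ||
| 3 | Completed: | 24 | Completed: | ... | ... |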
| ... | @@ -99,3 +99,7 @@ cd /workspace/acr-engine | ... | @@ -99,3 +99,7 @@ cd /workspace/acr-engine |
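| 99 | 2. Keep adding cap64 multi-seed runs instead of retaining only a single seed. | 99 | 2. Keep adding cap64 multi-seed runs instead of retaining only a single seed. |
| 100 | 3. Under the bucket baseline, keep reducing the variance of `hybrid` instead of locking in a global default strategy too early. | 100 | 3. Under the bucket baseline, keep reducing the variance of `hybrid` instead of locking in a global default strategy too early. |
| 101 | 4. Keep the stage rhythm of "doc update -> changelog -> commit -> push". | 101 | 4. Keep the stage rhythm of "doc update -> changelog -> commit -> push". |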
| 102 | |||
| 103 | |||
| 104 | - Added `acr-engine/scripts/business_export_offline_smoke.py` and obtained fresh end-to-end evidence from the offline smoke. | ||
| 105 | - Confirmed the chain: business export sample -> normalization -> project manifest -> `train.py --dry-run`. | ... | ... |
| ... | @@ -74,3 +74,16 @@ test -f /tmp/ab_smoke_seg_cap48_top2_seed999/report.json && cat /tmp/ab_smoke_se | ... | @@ -74,3 +74,16 @@ test -f /tmp/ab_smoke_seg_cap48_top2_seed999/report.json && cat /tmp/ab_smoke_se |
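| 74 | - `hybrid`: `mean_top1=1.0, mean_num_queries=4.0` | 74 | - `hybrid`: `mean_top1=1.0, mean_num_queries=4.0` |
| 75 | - `high_energy`: `mean_top1=1.0, mean_num_queries=3.5` | 75 | - `high_energy`: `mean_top1=1.0, mean_num_queries=3.5` |
| 76 | - This means the bucket baseline can already serve as the minimal engineering foundation for later work on "explaining why winners diverge across subsets". | 76 | - This means the bucket baseline can already serve as the minimal engineering foundation for later work on "explaining why winners diverge across subsets". |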
| 77 | |||
| 78 | |||
| 79 | ## Newly added real-run evidence | ||
| 80 | |||
| 81 | - New script: `acr-engine/scripts/business_export_offline_smoke.py` | ||
| 82 | - Ran successfully on local, genuinely readable audio: | ||
| 83 | - business export sample -> normalization -> project manifest -> `train.py --dry-run` | ||
| 84 | - Key results: | ||
| 85 | - `catalog_refs=2` | ||
| 86 | - `train_queries=1` | ||
| 87 | - `test_queries=1` | ||
| 88 | - `val_queries=0` | ||
| 89 | - `dry_run_passed=true` | ... | ... |
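
The recorded key results lend themselves to a small sanity check when the same chain is rerun on real business export data. The following is only a sketch under stated assumptions: it assumes the smoke summary is available as a dictionary keyed by the field names quoted in the checkpoint, and the function name `check_smoke_summary` and the specific pass criteria (equal row counts, at least one reference and one train/test query) are hypothetical, not taken from `business_export_offline_smoke.py`.

```python
from __future__ import annotations

from typing import Any


def check_smoke_summary(summary: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the summary looks sane.

    Hypothetical helper: the criteria below are illustrative assumptions,
    not rules enforced by the actual smoke script.
    """
    problems: list[str] = []

    # Assumption: normalization should neither drop nor add rows.
    rows_in = summary.get("input_rows")
    rows_out = summary.get("output_rows")
    if rows_in is None or rows_in != rows_out:
        problems.append("input_rows and output_rows are missing or differ")

    # Assumption: a usable manifest needs at least one entry of each kind.
    # val_queries is deliberately not checked, since the recorded run has 0.
    for key in ("catalog_refs", "train_queries", "test_queries"):
        if int(summary.get(key, 0)) < 1:
            problems.append(f"{key} must be at least 1")

    if summary.get("dry_run_passed") is not True:
        problems.append("train.py --dry-run did not pass")

    return problems


# Values copied from the checkpoint recorded in this commit.
recorded = {
    "input_rows": 5,
    "output_rows": 5,
    "catalog_refs": 2,
    "train_queries": 1,
    "test_queries": 1,
    "val_queries": 0,
    "dry_run_passed": True,
}

problems = check_smoke_summary(recorded)
print("OK" if not problems else "\n".join(problems))
```

A check like this only confirms that the integration chain still holds together; in line with the commit's directive, it is not a substitute for validating real business data.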