Preserve the observable build-index state so the next session can resume from the real bottleneck
Constraint: Long-running CPU-only chromaprint indexing has not reached evaluate yet
Rejected: Keep appending linear refs_done updates | produces noise without a stage transition
Confidence: high
Scope-risk: narrow
Directive: Do not create the next handoff commit until chromaprint completes, reference_* appears, evaluate starts, or the process fails
Tested: Verified /tmp/chroma_index_observable_smoke progress snapshot; reviewed updated handoff/changelog files
Not-tested: No new model/evaluation result because build-index has not reached the next stage
Showing 5 changed files with 71 additions and 54 deletions
| ... | @@ -72,40 +72,32 @@ | ... | @@ -72,40 +72,32 @@ |
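| 72 | - Convergence of `hybrid` fluctuations | 72 | - Convergence of `hybrid` fluctuations |
| 73 | - Evaluation on dataset combinations closer to commercial use | 73 | - Evaluation on dataset combinations closer to commercial use |
| 74 | 74 | ||
| 75 | ## 5.5 Latest real FMA smoke run state (2026-06-02) | 75 | ## 5.5 Latest real FMA / chromaprint run state (2026-06-02) |
| 76 | 76 | ||
| 77 | ### Current latest snapshot (13:36 UTC) | 77 | ### Current latest snapshot (14:25 UTC) |
| 78 | 78 | ||
| 79 | - Remote sync baseline: `c2d7820cdeebb142896916c0a03726521e5c09d8` | 79 | - Remote sync baseline: `bc6d07afbd1e31d3956d20e35c20c424bc21ba99` |
| 80 | - The real FMA smoke run has finished training; `best_model.pt` and `song_to_idx.json` have been generated. | 80 | - Pushed and complete: |
| 81 | - The most important active stage right now is not training but: | 81 | - chromaprint `_find_peaks()` behavior-equivalent optimization |
| 82 | - `run_demo.py build-index --data /tmp/fma_real_smoke_stopcheck/fma/manifests ...` | 82 | - chromaprint index-build observability |
| 83 | - As of `2026-06-02 13:36 UTC`: | 83 | - The new session's main monitoring targets should switch to: |
| 84 | - `evaluate.py` has still not appeared | 84 | - `PID=431703` |
| 85 | - The `fma_index_smoke/` directory has been created, but there is no evidence of index artifact files yet | 85 | - `/tmp/chroma_index_observable_smoke/chromaprint_progress.json` |
| 86 | - The new session should therefore not re-investigate training; it should focus on the `build-index -> evaluate` stage transition. | 86 | - `/tmp/chroma_index_observable_smoke/chromaprint.pkl` |
| 87 | 87 | - Evidence at `2026-06-02 14:25:32 UTC`: | |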
| 88 | 88 | - `status=building` | |
| 89 | - Real FMA data is ready locally: `acr-engine/data/raw/fma_small_audio/` | 89 | - `refs_done=1740/8000` |
| 90 | - Verified: | 90 | - `elapsed_sec=1385.4` |
| 91 | - `num_audio_files=8000` | 91 | - `eta_sec=4984.254` |
| 92 | - `eligible_query_files=7994` | 92 | - `hashes=229127` |
| 93 | - `ready_for_smoke=true` | 93 | - `postings=1510952` |
| 94 | - A real FMA end-to-end smoke run is currently in progress: | 94 | - Neither `reference_*` nor `evaluate.py` has appeared yet, so **no final accuracy conclusion can be reported yet**. |
| 95 | - Process: `src/data/external_adapters.py smoke-local fma ...` | 95 | - The old `PID=424691` real FMA full build-index process is still running, but it follows the old path started before the observability change; do not treat it as a source of validation for the new code. |
| 96 | - Output: `/tmp/fma_real_smoke_stopcheck` | 96 | - The only events that warrant the next commit are: |
| 97 | - Training subprocess: `train.py --data /tmp/fma_real_smoke_stopcheck/fma/manifests ...` | 97 | 1. `chromaprint_progress.json status=complete` |
| 98 | - Latest checkpoint (2026-06-02 12:09 UTC): | 98 | 2. A `reference_*` file appears |
| 99 | - `train.py` is still running | 99 | 3. `evaluate.py` starts |
| 100 | - `ELAPSED=12:00` | 100 | 4. Or an explicit failure |
| 101 | - `catalog_references=8000` | ||
| 102 | - `train_queries=6401` | ||
| 103 | - `test_queries=1593` | ||
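| 104 | - `fma_models_smoke/` is still empty, which is expected in the current implementation because `best_model.pt` is first saved only after `Epoch 1` ends | ||
| 105 | - Environment confirmed to have no GPU: | ||
| 106 | - `nvidia-smi` is unavailable | ||
| 107 | - `torch.cuda.is_available() = false` | ||
| 108 | - The real bottleneck right now is therefore not a bug but **the long runtime of the CPU-only real FMA smoke run**. | ||
| 109 | 101 | ||
| 110 | ## 6. High-risk notes | 102 | ## 6. High-risk notes |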
| 111 | 103 | ... | ... |
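
The `eta_sec` value in the 14:25:32 UTC snapshot is consistent with a plain linear extrapolation from `refs_done` and `elapsed_sec`. The sketch below reproduces that arithmetic. It is an illustration only: the helper name `linear_eta_sec` is hypothetical, and whether `run_demo.py` computes its ETA this way is an assumption, not something this commit confirms.

```python
from typing import Optional


def linear_eta_sec(refs_done: int, refs_total: int, elapsed_sec: float) -> Optional[float]:
    """Estimate the remaining seconds, assuming a constant per-reference rate.

    Hypothetical helper: it mirrors the snapshot fields (refs_done,
    elapsed_sec, eta_sec) but is not taken from the project's code.
    """
    if refs_done <= 0 or elapsed_sec <= 0:
        return None  # no rate can be derived yet
    rate = refs_done / elapsed_sec          # references processed per second
    remaining = max(refs_total - refs_done, 0)
    return remaining / rate


# Values from the 2026-06-02 14:25:32 UTC snapshot.
eta = linear_eta_sec(refs_done=1740, refs_total=8000, elapsed_sec=1385.4)
print(f"eta_sec ~= {eta:.1f}")
```

With the snapshot values this lands within a few milliseconds of the reported `eta_sec=4984.254`; the small gap is plausibly due to `elapsed_sec` being rounded in the snapshot.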
| 1 | ## 2026-06-02 14:25 UTC / restart-package handoff refresh | ||
| 2 | |||
| 3 | - Delivery baseline refreshed to: `bc6d07afbd1e31d3956d20e35c20c424bc21ba99` | ||
| 4 | - Recorded the most important current run evidence: the observable chromaprint smoke run | ||
| 5 | - `PID=431703` | ||
| 6 | - `status=building` | ||
| 7 | - `refs_done=1740/8000` | ||
| 8 | - `hashes=229127` | ||
| 9 | - `postings=1510952` | ||
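| 10 | - Clarified that the old real FMA build-index process is background run state only and no longer a source of validation for the new observability code | ||
| 11 | - Rewrote the delivery/handoff docs so a new session can continue directly from the `chromaprint -> reference_* -> evaluate` stage | ||
| 12 | - Constraints unchanged: do not commit `data/raw`, `data/external_smoke`, `/tmp`, checkpoint, `__pycache__` | ||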
| 13 | |||
| 1 | ## 2026-06-02 chromaprint build-index observability checkpoint | 14 | ## 2026-06-02 chromaprint build-index observability checkpoint |
| 2 | 15 | ||
| 3 | Completed items: | 16 | Completed items: | ... | ... |
| ... | @@ -5,26 +5,38 @@ | ... | @@ -5,26 +5,38 @@ |
| 5 | 5 | ||
| 6 | ## One-page summary | 6 | ## One-page summary |
| 7 | 7 | ||
| 8 | ### Latest delivery snapshot (2026-06-02 13:36 UTC) | 8 | ### Latest delivery snapshot (2026-06-02 14:25 UTC) |
| 9 | 9 | ||
| 10 | - Current remote sync baseline: `c2d7820cdeebb142896916c0a03726521e5c09d8` | 10 | - Current remote sync baseline: `bc6d07afbd1e31d3956d20e35c20c424bc21ba99` |
| 11 | - The real FMA full smoke run **has finished training** and has produced: | 11 | - Latest officially delivered code capabilities: |
| 12 | - `/tmp/fma_real_smoke_stopcheck/fma_models_smoke/best_model.pt` | 12 | - chromaprint `_find_peaks()` behavior-equivalent speedup |
| 13 | - `/tmp/fma_real_smoke_stopcheck/fma_models_smoke/song_to_idx.json` | 13 | - chromaprint index-build progress observability |
| 14 | - The main pipeline is still at **`run_demo.py build-index`**: | 14 | - `run_demo.py --chromaprint-checkpoint-every-refs` |
| 15 | - `PID=311494`: `external_adapters.py smoke-local ...` | 15 | - The most important live evidence is no longer the old full FMA process but the **new observable chromaprint smoke run**: |
| 16 | - `PID=424691`: `run_demo.py build-index --data /tmp/fma_real_smoke_stopcheck/fma/manifests ...` | 16 | - `PID=431703` |
| 17 | - As of `2026-06-02 13:36 UTC`: | 17 | - Output directory: `/tmp/chroma_index_observable_smoke` |
| 18 | - `evaluate.py` has still not been observed | 18 | - Latest observation at `2026-06-02 14:25:32 UTC`: |
| 19 | - `/tmp/fma_real_smoke_stopcheck/fma_index_smoke/` exists, but no index artifact files have been seen yet | 19 | - `status=building` |
| 20 | - Manifest re-validation still passes: `catalog_references=8000`, `train_queries=6401`, `test_queries=1593`, `ok=true` | 20 | - `refs_done=1740 / 8000` |
| 21 | - Conclusion: training is not hung; rather, **the CPU-only full real FMA run is in a long index build**. | 21 | - `elapsed_sec=1385.4` |
| 22 | - Only two pieces of evidence matter next: | 22 | - `eta_sec=4984.254` |
| 23 | 1. The first index artifact appears | 23 | - `hashes=229127` |
| 24 | 2. The main pipeline switches to `evaluate.py` | 24 | - `postings=1510952` |
| 25 | 25 | - `chromaprint.pkl=16787221 bytes` | |
| 26 | This is a **music ACR / music retrieval** project moving from prototype toward production. | 26 | - Not yet present: |
| 27 | Completed so far: | 27 | - `reference_progress.json` |
| 28 | - `reference_embs.partial.npy` | ||
| 29 | - `reference_ids.partial.npy` | ||
| 30 | - `reference_embs.npy` | ||
| 31 | - `reference_ids.npy` | ||
| 32 | - `evaluate.py` | ||
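| 33 | - The old real FMA full process `PID=424691` is still running, but it was started before the observability change; **do not use it as validation evidence for the new code path**. | ||
| 34 | - Conclusion: this is neither a training problem nor a lack of evidence for the new logic; **the CPU-only chromaprint build-index is simply still progressing steadily and has not yet changed stage**. | ||
| 35 | - Only four kinds of events warrant the next doc update/commit: | ||
| 36 | 1. `chromaprint_progress.json` changes to `status=complete` | ||
| 37 | 2. Any `reference_*` file appears | ||
| 38 | 3. `evaluate.py` starts | ||
| 39 | 4. The process exits with an error | ||
| 28 | 40 | ||
| 29 | This is a **music ACR / music retrieval** project moving from prototype toward production. | 41 | This is a **music ACR / music retrieval** project moving from prototype toward production. |
| 30 | Completed so far: | 42 | Completed so far: | ... | ... |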
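
The four commit-worthy events listed in the handoff snapshot can be checked mechanically. The following is a minimal sketch under stated assumptions: `chromaprint_progress.json` is taken to be a JSON object with a `status` key (inferred from the `status=building` evidence), the function name `next_handoff_event` and the priority order of the checks are hypothetical, and detecting whether `evaluate.py` is running or the process has failed is left to the caller.

```python
import json
from pathlib import Path
from typing import Optional


def next_handoff_event(out_dir, evaluate_running: bool = False,
                       process_failed: bool = False) -> Optional[str]:
    """Return the first commit-worthy event observed in out_dir, or None.

    Hypothetical helper; the file layout and the "status" key are
    assumptions based on the handoff notes, not the project's API.
    """
    out = Path(out_dir)
    if process_failed:
        return "failed"
    if evaluate_running:
        return "evaluate_started"
    # Any reference_* artifact (e.g. reference_progress.json) counts.
    if any(out.glob("reference_*")):
        return "reference_artifact"
    progress_file = out / "chromaprint_progress.json"
    if progress_file.exists():
        try:
            status = json.loads(progress_file.read_text()).get("status")
        except (json.JSONDecodeError, AttributeError):
            return None  # file is mid-write or not a JSON object
        if status == "complete":
            return "chromaprint_complete"
    return None  # still building: nothing worth a handoff commit yet
```

Calling `next_handoff_event("/tmp/chroma_index_observable_smoke")` while the snapshot still shows `status=building` and no `reference_*` file would return `None`, which matches the directive not to create another handoff commit yet.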