pin hum_focus as the current dual-axis search anchor

Constraint: Keep the handoff restart-safe and avoid staging temporary sweep artifacts
Rejected: Switch back to v6 or continue blind search | Fresh evidence shows hum_focus is the current best candidate and the right anchor for finer tuning
Confidence: high
Scope-risk: narrow
Directive: Use hum_focus as the baseline for the next micro-search, preserving humming_like gains while keeping confused at 0.25
Tested: Verified hum_focus versus hum_balanced with fresh eval results and updated docs accordingly
Not-tested: Whether a further micro-tuned variant beats hum_focus
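The Directive and Rejected trailers describe a selection rule: keep `confused` at 0.25 and prefer the candidate with the larger `humming_like` gain. A minimal Python sketch of one way to encode that rule follows. It assumes `confused` is a higher-is-better score (the docs in this commit only say it should not drop), and the names `EvalResult`, `pick_anchor`, and `confused_floor` are hypothetical, not taken from the repository. The metric values are copied from the comparison table in this commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """One candidate's eval metrics (hypothetical container, not a repo type)."""
    name: str
    top1: float
    topk: float
    humming_like: float
    confused: float


def pick_anchor(results, confused_floor=0.25):
    """Pick the candidate with the highest humming_like among those that
    keep confused at or above the floor; ties fall back to top1, then topk."""
    eligible = [r for r in results if r.confused >= confused_floor]
    if not eligible:
        raise ValueError("no candidate keeps confused at or above the floor")
    return max(eligible, key=lambda r: (r.humming_like, r.top1, r.topk))


# Values as reported in this commit for the two compared candidates.
RESULTS = [
    EvalResult("hum_focus", top1=0.7, topk=0.85, humming_like=0.5, confused=0.25),
    EvalResult("hum_balanced", top1=0.65, topk=0.95, humming_like=0.25, confused=0.25),
]

if __name__ == "__main__":
    print(pick_anchor(RESULTS).name)
```

Ordering by `humming_like` first is a design choice that mirrors the Directive. A rule that ranked by `topk` first would pick `hum_balanced` instead, so the ordering should be stated explicitly wherever the anchor is chosen.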
Showing 5 changed files with 93 additions and 22 deletions
| ... | @@ -74,15 +74,20 @@ | ... | @@ -74,15 +74,20 @@ |
| 74 | 74 | ||
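| 75 | ## 5.5 Latest real FMA / chromaprint run state (2026-06-02) | 75 | ## 5.5 Latest real FMA / chromaprint run state (2026-06-02) |
| 76 | 76 | ||
| 77 | ### Latest snapshot (15:56 UTC) | 77 | ### Latest snapshot (16:03 UTC) |
| 78 | 78 | ||
| 79 | - Remote sync baseline: `6279850` (before this update) | 79 | - Remote sync baseline: `9c3f182` (before this update) |
| 80 | - Most important new evidence: **the dual-axis smoke run completed its first end-to-end eval, but the current combination did not improve humming_like**. | 80 | - Most important new evidence: **the dual-axis candidates have converged on hum_focus**. |
| 81 | - Result: `top1=0.5`, `topk=0.9`, `humming_like=0.0`, `confused=0.25` | 81 | - `hum_focus` currently outperforms `hum_balanced` and `v6`: |
| 82 | - This shows the dual-axis entry point works, but the current weight combination is not an improvement. | 82 | - `top1=0.7` |
| 83 | - `topk=0.85` | ||
| 84 | - `humming_like=0.5` | ||
| 85 | - `confused=0.25` | ||
| 86 | - `hum_balanced` only returns to the `v6` level and does not surpass `hum_focus`. | ||
| 87 | - This shows the most worthwhile next step is small-step fine-tuning around `hum_focus`, not a return to coarser-grained search. | ||
| 83 | - Events that would justify the next commit: | 88 | - Events that would justify the next commit: |
| 84 | 1. Results from a finer-grained dual-axis weight search | 89 | 1. A better weight combination found by fine-tuning `hum_focus` |
| 85 | 2. A combination where `humming_like` recovers and `confused` does not drop | 90 | 2. `humming_like` stays high while `confused` does not drop |
| 86 | 3. Improved results from dual-track regression validation | 91 | 3. Improved results from dual-track regression validation |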
| 87 | 92 | ||
| 88 | 93 | ... | ... |
| 1 | ## 2026-06-02 16:03 UTC / hum_focus pinned as the current best dual-axis candidate | ||
| 2 | |||
| 3 | - Continued comparing `hum_focus` and `hum_balanced` in the dual-axis search and obtained finer-grained fresh evidence | ||
| 4 | - fresh evidence (`2026-06-02 16:03:13 UTC`): | ||
| 5 | - `hum_focus`: `top1=0.7`, `topk=0.85`, `humming_like=0.5`, `confused=0.25` | ||
| 6 | - `hum_balanced`: `top1=0.65`, `topk=0.95`, `humming_like=0.25`, `confused=0.25` | ||
| 7 | - Conclusions: | ||
| 8 | - `hum_focus` is currently the best candidate of this search round | ||
| 9 | - It brings `humming_like` back from `0.0` to `0.5` while holding `confused=0.25` | ||
| 10 | - `hum_balanced` only returns to the `v6` level and does not surpass `hum_focus` | ||
| 11 | - The most worthwhile next step is fine-tuning around `hum_focus`, not falling back to `v6` or continuing blind search | ||
| 12 | |||
| 1 | ## 2026-06-02 15:56 UTC / dual-axis smoke completed first end-to-end eval | 13 | ## 2026-06-02 15:56 UTC / dual-axis smoke completed first end-to-end eval |
| 2 | 14 | ||
| 3 | - Ran one end-to-end smoke pass with the new dual-axis configuration: `train -> build-index -> evaluate` | 15 | - Ran one end-to-end smoke pass with the new dual-axis configuration: `train -> build-index -> evaluate` | ... | ... |
| ... | @@ -323,3 +323,24 @@ | ... | @@ -323,3 +323,24 @@ |
| 323 | 323 | ||
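| 324 | - This dual-axis configuration proves that the "configurable experiment pipeline" works end to end. | 324 | - This dual-axis configuration proves that the "configurable experiment pipeline" works end to end. |
| 325 | - However, it brought no `humming_like` improvement, so subsequent searches need to be finer: the value granularity of `sample_type_weights` and `pair_type_weights` should be broken down further. | 325 | - However, it brought no `humming_like` improvement, so subsequent searches need to be finer: the value granularity of `sample_type_weights` and `pair_type_weights` should be broken down further. |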
| 326 | |||
| 327 | ## Additional delivery for this update (2026-06-02 16:03 UTC) | ||
| 328 | |||
| 329 | ### New run evidence | ||
| 330 | |||
| 331 | | Candidate | top1 | topk | humming_like top1 | confused top1 | Conclusion | | ||
| 332 | |---|---:|---:|---:|---:|---| | ||
| 333 | | hum_focus | 0.7 | 0.85 | 0.5 | 0.25 | Current best | | ||
| 334 | | hum_balanced | 0.65 | 0.95 | 0.25 | 0.25 | Only returns to the v6 level | | ||
| 335 | |||
| 336 | ### Most important fresh evidence | ||
| 337 | |||
| 338 | - Observed at: `2026-06-02 16:03:13 UTC` | ||
| 339 | - `hum_focus` result file: `/tmp/dualaxis_sweep/hum_focus/eval.json` | ||
| 340 | - `hum_balanced` result file: `/tmp/dualaxis_sweep/hum_balanced/eval.json` | ||
| 341 | - Comparison: `hum_focus` beats `hum_balanced` on `humming_like` and is better overall. | ||
| 342 | |||
| 343 | ### Conclusion | ||
| 344 | |||
| 345 | - The best candidate on the dual-axis line has converged to `hum_focus`. | ||
| 346 | - The next round should run a fine-tuning search around `hum_focus`, not fall back to `v6` or widen the blind search. | ... | ... |
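| 1 | ## Additional update to this delivery package (2026-06-02 16:03 UTC) | ||
| 2 | |||
| 3 | ### Delivery conclusion | ||
| 4 | |||
| 5 | The latest milestone has advanced from "dual-axis first round runs end to end" to **dual-axis candidates have converged on hum_focus**: | ||
| 6 | - Current remote baseline: `9c3f182` (before this update) | ||
| 7 | - `hum_focus` currently outperforms `hum_balanced` and the `v6` baseline | ||
| 8 | - The next round should therefore fine-tune around `hum_focus`, not fall back or search blindly | ||
| 9 | |||
| 10 | ### Latest facts | ||
| 11 | |||
| 12 | #### dual-axis comparison results | ||
| 13 | - `hum_focus`: | ||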
| 14 | - `top1=0.7` | ||
| 15 | - `topk=0.85` | ||
| 16 | - `humming_like=0.5` | ||
| 17 | - `confused=0.25` | ||
| 18 | - `hum_balanced`: | ||
| 19 | - `top1=0.65` | ||
| 20 | - `topk=0.95` | ||
| 21 | - `humming_like=0.25` | ||
| 22 | - `confused=0.25` | ||
| 23 | |||
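| 24 | ### Current assessment | ||
| 25 | |||
| 26 | - `hum_focus` is currently the dual-axis starting point most worth iterating on. | ||
| 27 | - Recommended next phase: run a small-step search anchored on `hum_focus`, prioritizing keeping the `humming_like` advantage. | ||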
| 28 | |||
| 29 | --- | ||
| 30 | |||
| 1 | ## Additional update to this delivery package (2026-06-02 15:56 UTC) | 31 | ## Additional update to this delivery package (2026-06-02 15:56 UTC) |
| 2 | 32 | ||
| 3 | ### Delivery conclusion | 33 | ### Delivery conclusion | ... | ... |
| ... | @@ -5,23 +5,26 @@ | ... | @@ -5,23 +5,26 @@ |
| 5 | 5 | ||
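| 6 | ## One-page summary | 6 | ## One-page summary |
| 7 | 7 | ||
| 8 | ### Latest delivery snapshot (2026-06-02 15:56 UTC) | 8 | ### Latest delivery snapshot (2026-06-02 16:03 UTC) |
| 9 | 9 | ||
| 10 | - Current remote sync baseline: `6279850` (before this update) | 10 | - Current remote sync baseline: `9c3f182` (before this update) |
| 11 | - Most important new fact: **the dual-axis smoke run completed its first end-to-end eval, but the current combination did not improve humming_like** | 11 | - Most important new fact: **the dual-axis candidates have converged on hum_focus** |
| 12 | - Result: | 12 | - `hum_focus`: |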
| 13 | - `num_queries=20` | 13 | - `top1=0.7` |
| 14 | - `top1=0.5` | 14 | - `topk=0.85` |
| 15 | - `topk=0.9` | 15 | - `humming_like=0.5` |
| 16 | - `humming_like=0.0` | 16 | - `confused=0.25` |
| 17 | - `hum_balanced`: | ||
| 18 | - `top1=0.65` | ||
| 19 | - `topk=0.95` | ||
| 20 | - `humming_like=0.25` | ||
| 17 | - `confused=0.25` | 21 | - `confused=0.25` |
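| 18 | - Conclusions: | 22 | - Conclusions: |
| 19 | - The dual-axis entry point is usable | 23 | - `hum_focus` is currently the best candidate of this dual-axis search round |
| 20 | - But the current weight combination is not an improvement | 24 | - The next round should fine-tune around `hum_focus`, not return to `v6` or continue blind search |
| 21 | - The next round should run a finer-grained weight search | ||
| 22 | - Top priorities for the new session: | 25 | - Top priorities for the new session: |
| 23 | 1. Continue searching `sample_type_weights` / `pair_type_weights` | 26 | 1. Run a small-step weight search around `hum_focus` |
| 24 | 2. The goal is to bring `humming_like` back to at least the `v6` level without losing `confused` | 27 | 2. Prioritize keeping the `humming_like` advantage |
| 25 | 3. Then rerun the dual-track test: real-path clean + synthetic hard-case | 28 | 3. Then rerun the dual-track test: real-path clean + synthetic hard-case |
| 26 | 29 | ||
| 27 | ### Latest observability fix (2026-06-02 15:18 UTC) | 30 | ### Latest observability fix (2026-06-02 15:18 UTC) | ... | ... |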
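The updated first priority in the hunk above is a small-step weight search around `hum_focus`. One possible way to enumerate such a neighborhood is sketched below in Python. The actual `hum_focus` weights do not appear in this diff, so the keys and values in `ANCHOR` are placeholders, and `neighborhood`, `step`, and `radius` are hypothetical names rather than anything from the repository.

```python
from itertools import product


def neighborhood(anchor, step=0.05, radius=1):
    """Enumerate weight dicts within `radius` steps of the anchor on every axis.

    Candidates with a negative weight are dropped, and the anchor itself is
    excluded because it has already been evaluated.
    """
    keys = sorted(anchor)
    offsets = [i * step for i in range(-radius, radius + 1)]
    candidates = []
    for combo in product(offsets, repeat=len(keys)):
        cand = {k: round(anchor[k] + d, 6) for k, d in zip(keys, combo)}
        if cand != anchor and all(v >= 0 for v in cand.values()):
            candidates.append(cand)
    return candidates


# Placeholder anchor: NOT the real hum_focus weights, which this diff omits.
ANCHOR = {"humming_like": 1.5, "confused": 1.0}

if __name__ == "__main__":
    for cand in neighborhood(ANCHOR):
        print(cand)
```

The grid grows as `(2 * radius + 1) ** len(anchor)`, so a small radius keeps the sweep close to the anchor, which matches the stated preference for fine-tuning over blind search.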