Why the handoff should include a concrete live feature lineage example
Constraint: Future sessions need a zero-ambiguity PostgreSQL example that matches the current live song-centric pipeline
Rejected: Only describing lineage abstractly | forces re-verification every session
Confidence: high
Scope-risk: narrow
Directive: Prefer concrete live feature/window/asset/song examples in handoff docs whenever the default path changes
Tested: markdown link check under docs; live SQL verification for feature_id=34 lineage; manifest feature sample extraction
Not-tested: Re-running the full directory pipeline in this commit
Showing 3 changed files with 47 additions and 0 deletions
| ... | @@ -3,6 +3,7 @@ | ... | @@ -3,6 +3,7 @@ |
| 3 | ## 2026-06-04 | 3 | ## 2026-06-04 |
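| 4 | - Continue converging the docs on the current live main-chain conventions: fill in the binding notes for `feature_fact.object_id -> audio_object(window)`, `window.parent_object_id -> asset`, and `feature_fact.song_id -> media_entity(song)`, and add paired manifest/SQL samples that specifically answer how the Phase-1 open-source model set should be stored and how a feature is linked to its audio object. | 4 | - Continue converging the docs on the current live main-chain conventions: fill in the binding notes for `feature_fact.object_id -> audio_object(window)`, `window.parent_object_id -> asset`, and `feature_fact.song_id -> media_entity(song)`, and add paired manifest/SQL samples that specifically answer how the Phase-1 open-source model set should be stored and how a feature is linked to its audio object. |
| 5 | - Fix stale semantic-lane status left over in `docs/session-handoff.md` and align it with the current facts: the live default is now `chromaprint_matcher + mert-v1-95m`, and MuQ remains the next-phase challenger. | 5 | - Fix stale semantic-lane status left over in `docs/session-handoff.md` and align it with the current facts: the live default is now `chromaprint_matcher + mert-v1-95m`, and MuQ remains the next-phase challenger. |
| 6 | - Add another verifiable live sample: record the real PostgreSQL trace-back result `feature_id = 34 -> window_id = 22 -> asset_id = 20 -> song_beta` in the handoff and schema sample docs, so the next session can manually verify the binding chain directly. | ||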
| 6 | 7 | ||
| 7 | ## 2026-06-04 | 8 | ## 2026-06-04 |
| 8 | - Fresh runtime progress: `torch-2.12.0+cpu`, `torchaudio-2.11.0+cpu`, and `transformers-5.10.1` are now installed on the current host. After re-running the song-centric main chain, confirmed `semantic_runtime_available = true`, `semantic_runtime_ready_count = 5`, and `semantic_fallback_count = 0`. The semantic lane has moved from fallback to `mert-v1-95m`; the next step is to wire up the `MuQ` adapter without breaking the current MERT baseline. | 9 | - Fresh runtime progress: `torch-2.12.0+cpu`, `torchaudio-2.11.0+cpu`, and `transformers-5.10.1` are now installed on the current host. After re-running the song-centric main chain, confirmed `semantic_runtime_available = true`, `semantic_runtime_ready_count = 5`, and `semantic_fallback_count = 0`. The semantic lane has moved from fallback to `mert-v1-95m`; the next step is to wire up the `MuQ` adapter without breaking the current MERT baseline. | ... | ... |
| ... | @@ -752,6 +752,29 @@ where ff.feature_id = :feature_id; | ... | @@ -752,6 +752,29 @@ where ff.feature_id = :feature_id; |
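| 752 | 3. Use `parent_object_id` to find the `asset` it belongs to | 752 | 3. Use `parent_object_id` to find the `asset` it belongs to |
| 753 | 4. Use `song_id` to find the `song` it ultimately belongs to | 753 | 4. Use `song_id` to find the `song` it ultimately belongs to |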
| 754 | 754 | ||
| 755 | ### 14.1 A real result from the current live database | ||
| 756 | |||
| 757 | In the current PostgreSQL database `acr_songcentric_test`, the real trace-back result for `feature_id = 34` is: | ||
| 758 | |||
| 759 | ```text | ||
| 760 | feature_id = 34 | ||
| 761 | feature_type = embedding | ||
| 762 | model_name = mert-v1-95m | ||
| 763 | model_version = hf-main | ||
| 764 | feature_set_name = mert_5s_hop2.5_v1 | ||
| 765 | window_id = 22 | ||
| 766 | window_range = 1000-6000 ms | ||
| 767 | asset_id = 20 | ||
| 768 | asset_uri = /workspace/acr-engine/data/songcentric_builder_smoke/song_beta/artist_b/clip2.wav | ||
| 769 | song_id = 9 | ||
| 770 | song_biz_key = song_beta | ||
| 771 | ``` | ||
| 772 | |||
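| 773 | This live result shows that: | ||
| 774 | - The current real semantic baseline is already `mert-v1-95m` | ||
| 775 | - A single embedding feature can be traced back exactly to a specific `window/asset/song` | ||
| 776 | - This is the minimal evidence loop for "quickly locating the song_id" in the current copyright-protection chain | ||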
| 777 | |||
| 755 | ## 15. A complete multi-asset / multi-window / multi-model example | 778 | ## 15. A complete multi-asset / multi-window / multi-model example |
| 756 | 779 | ||
| 757 | Assume: | 780 | Assume: | ... | ... |
| ... | @@ -196,6 +196,29 @@ flowchart TD | ... | @@ -196,6 +196,29 @@ flowchart TD |
| 196 | - Current blocker: `import muq` raises `RuntimeError: operator torchvision::nms does not exist` | 196 | - Current blocker: `import muq` raises `RuntimeError: operator torchvision::nms does not exist` |
| 197 | - Conclusion: MuQ remains the next-phase challenger, not the current live default baseline | 197 | - Conclusion: MuQ remains the next-phase challenger, not the current live default baseline |
| 198 | 198 | ||
| 199 | ### A live sample you can verify directly | ||
| 200 | |||
| 201 | `feature_id = 34` can be used directly for manual verification: | ||
| 202 | |||
| 203 | ```text | ||
| 204 | feature_id = 34 | ||
| 205 | feature_type = embedding | ||
| 206 | model_name = mert-v1-95m | ||
| 207 | model_version = hf-main | ||
| 208 | feature_set_name = mert_5s_hop2.5_v1 | ||
| 209 | window_id = 22 | ||
| 210 | window_range = 1000-6000 ms | ||
| 211 | asset_id = 20 | ||
| 212 | asset_uri = /workspace/acr-engine/data/songcentric_builder_smoke/song_beta/artist_b/clip2.wav | ||
| 213 | song_id = 9 | ||
| 214 | song_biz_key = song_beta | ||
| 215 | ``` | ||
| 216 | |||
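| 217 | This sample shows directly that: | ||
| 218 | - A feature is not attached directly to a song; it is first attached to a `window` | ||
| 219 | - The `window` leads back to its `asset` via `parent_object_id` | ||
| 220 | - Finally, `song_id` leads back to `song_beta` | ||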
| 221 | |||
| 199 | ### Current manifest shape (before import) | 222 | ### Current manifest shape (before import) |
| 200 | 223 | ||
| 201 | ```json | 224 | ```json | ... | ... |
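
The lineage chain described in this diff (`feature_fact.object_id -> audio_object(window)`, `window.parent_object_id -> asset`, `feature_fact.song_id -> media_entity(song)`) can be illustrated with a small self-contained sketch. It uses an in-memory SQLite database as a stand-in for the live PostgreSQL instance and reuses the `feature_id = 34` sample values from the docs. Only the table names and the columns `feature_id`, `object_id`, `parent_object_id`, and `song_id` come from the diff; every other column name (`object_type`, `start_ms`, `end_ms`, `uri`, `entity_id`, `entity_type`, `biz_key`) and both helper functions are assumptions for illustration, not the project's actual schema or code.

```python
import sqlite3

# Hypothetical minimal schema. Column names other than feature_id, object_id,
# parent_object_id, and song_id are assumptions made for this sketch.
SCHEMA = """
create table media_entity (
    entity_id integer primary key, entity_type text, biz_key text
);
create table audio_object (
    object_id integer primary key,
    object_type text,             -- 'asset' or 'window' (assumed values)
    parent_object_id integer,     -- a window points at its asset
    start_ms integer, end_ms integer, uri text
);
create table feature_fact (
    feature_id integer primary key,
    object_id integer,            -- the window the feature was computed on
    song_id integer,              -- the song the feature belongs to
    feature_type text, model_name text,
    model_version text, feature_set_name text
);
"""

# feature -> window -> asset via a self-join on audio_object, plus feature -> song.
LINEAGE_SQL = """
select ff.feature_id, ff.model_name,
       w.object_id as window_id, w.start_ms, w.end_ms,
       a.object_id as asset_id, a.uri as asset_uri,
       s.entity_id as song_id, s.biz_key as song_biz_key
from feature_fact ff
join audio_object w on w.object_id = ff.object_id and w.object_type = 'window'
join audio_object a on a.object_id = w.parent_object_id and a.object_type = 'asset'
join media_entity s on s.entity_id = ff.song_id
where ff.feature_id = :feature_id;
"""

ASSET_URI = ("/workspace/acr-engine/data/songcentric_builder_smoke/"
             "song_beta/artist_b/clip2.wav")


def build_demo_db():
    """Create an in-memory database holding only the documented sample row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("insert into media_entity values (9, 'song', 'song_beta')")
    conn.execute(
        "insert into audio_object values (20, 'asset', null, null, null, ?)",
        (ASSET_URI,),
    )
    conn.execute(
        "insert into audio_object values (22, 'window', 20, 1000, 6000, null)"
    )
    conn.execute(
        "insert into feature_fact values "
        "(34, 22, 9, 'embedding', 'mert-v1-95m', 'hf-main', 'mert_5s_hop2.5_v1')"
    )
    return conn


def trace_feature(conn, feature_id):
    """Return the window/asset/song lineage of one feature, or None if absent."""
    conn.row_factory = sqlite3.Row
    row = conn.execute(LINEAGE_SQL, {"feature_id": feature_id}).fetchone()
    return dict(row) if row is not None else None


if __name__ == "__main__":
    print(trace_feature(build_demo_db(), 34))
```

Joining `audio_object` to itself keeps the window and asset hops explicit, which mirrors the trace-back steps listed in the docs. Against the real database the same query shape would need the actual column names substituted in.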