Commit 21388b99 21388b9963b28f79fbbfd8837ed690ab7e4d2d86 by cnb.bofCdSsphPA

Why the handoff needs the exact semantic adapter insertion point

Constraint: The next session should not spend time rediscovering where the real MERT/MuQ adapter belongs in the song-centric pipeline
Rejected: Leave the adapter step as a generic future task | does not identify the concrete file and function to change
Confidence: high
Scope-risk: narrow
Directive: Keep future semantic-adapter handoffs anchored on enrich_songcentric_manifest_with_local_features.py unless the host pipeline entrypoint changes
Tested: markdown link check on /workspace/docs after adding the semantic adapter handoff note
Not-tested: No runtime install or adapter implementation yet; this commit records the verified insertion point only
1 parent e6c2e0a1
@@ -158,3 +158,51 @@ flowchart TD
## One-line handoff

> Next time, do not restart from debating the overall plan. Run the song-centric runner directly; if the exact lane works and the semantic lane still falls back, keep working on the real semantic adapter and its dependencies.

---

## 10. Where the real semantic adapter should be wired in next

The most direct insertion point is now confirmed:

- Entry script: `acr-engine/scripts/enrich_songcentric_manifest_with_local_features.py`
- Key function: `build_semantic_feature(...)`

### Current actual state

- The exact lane already prefers reusing `ChromaprintMatcher`
- The semantic lane does not yet have a real `MERT / MuQ` integration
- When the runtime is ready, it currently still only produces:
  - `model_name = semantic_runtime_ready_placeholder`
- When the runtime is not ready, it falls back to:
  - `model_name = local_wavehash_embed`

### Facts from the fresh dependency check

The host is currently still missing:
- `torch`
- `torchaudio`
- `transformers`

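The dependency check can be repeated before starting any adapter work. The following is a minimal sketch, not the check that produced the facts in this note: the helper name `find_missing_modules` is hypothetical, and it only tests whether each package can be located by the interpreter that runs it, not whether it imports cleanly.

```python
import importlib.util

# Packages the note lists as missing on the host.
REQUIRED_MODULES = ("torch", "torchaudio", "transformers")


def find_missing_modules(names=REQUIRED_MODULES):
    """Return the module names the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it with the same interpreter the pipeline uses (`/usr/local/miniconda3/bin/python` in this note), since a different environment may report a different result.
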
### Most direct implementation order for the next session

1. Install `torch / torchaudio / transformers`
2. Wire a real `MERT` / `MuQ` adapter inside `build_semantic_feature(...)`
3. Keep the current `local_wavehash_embed` fallback; do not delete it
4. Rerun:

```bash
cd /workspace
/usr/local/miniconda3/bin/python acr-engine/scripts/run_songcentric_directory_pipeline_live.py \
  --dsn 'postgres://d2:d2pass@127.0.0.1:5432/d2' \
  --schema acr_songcentric_test \
  --input-root acr-engine/data/songcentric_builder_smoke \
  --output-dir acr-engine/data/pgvector_eval/music20
```

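Steps 2 and 3 together describe a "try the real model, keep the fallback" shape. The sketch below illustrates that shape only; it is not the actual body of `build_semantic_feature(...)`, whose signature this note does not show. The function name, the `real_embed` / `fallback_embed` callables, the returned dictionary keys, and the `mert_or_muq` label are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

# Fallback model name as recorded in this note.
FALLBACK_MODEL_NAME = "local_wavehash_embed"


def build_semantic_feature_sketch(
    audio_path: str,
    fallback_embed: Callable[[str], List[float]],
    real_embed: Optional[Callable[[str], List[float]]] = None,
    real_model_name: str = "mert_or_muq",  # hypothetical label, not from the repo
) -> Dict[str, object]:
    """Prefer the real adapter; use the fallback if it is absent or fails."""
    error = None
    if real_embed is not None:
        try:
            return {
                "model_name": real_model_name,
                "embedding": real_embed(audio_path),
                "fallback": False,
                "error": None,
            }
        except Exception as exc:  # keep the pipeline running on adapter failure
            error = repr(exc)
    return {
        "model_name": FALLBACK_MODEL_NAME,
        "embedding": fallback_embed(audio_path),
        "fallback": True,
        "error": error,
    }
```

Recording the adapter error alongside the fallback result makes it easier to tell "runtime missing" apart from "adapter crashed" when reading the rerun output.
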
### Expected changes in the fresh metrics

- `semantic_runtime_available = true`
- `semantic_runtime_ready_count > 0`
- `semantic_fallback_count` drops noticeably or reaches zero

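These three expectations can be turned into a small before/after check. This is a sketch under assumptions: the note does not say where or in what format the runner reports these metrics, so the helper below (hypothetical name `check_semantic_metrics`) simply takes two dictionaries keyed by the metric names above. "Drops noticeably" is not quantified in the note; the sketch treats any strict decrease, or zero, as passing.

```python
from typing import List, Mapping


def check_semantic_metrics(before: Mapping, after: Mapping) -> List[str]:
    """Return a list of unmet expectations; an empty list means all passed."""
    problems = []
    if after.get("semantic_runtime_available") is not True:
        problems.append("semantic_runtime_available is not true")
    if not after.get("semantic_runtime_ready_count", 0) > 0:
        problems.append("semantic_runtime_ready_count is not > 0")
    old_fallback = before.get("semantic_fallback_count", 0)
    new_fallback = after.get("semantic_fallback_count", 0)
    # Pass if the count reached zero or strictly decreased.
    if new_fallback != 0 and not new_fallback < old_fallback:
        problems.append("semantic_fallback_count did not drop")
    return problems
```
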
......