


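Session Handoff / Continuous Development Handoff Document


Goal: a new session started next time should know where to begin within 3 to 10 minutes.


1. What to do first on the next start

Run the current main pipeline directly first: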

cd /workspace
/usr/local/miniconda3/bin/python acr-engine/scripts/run_songcentric_directory_pipeline_live.py \
  --dsn 'postgres://d2:d2pass@127.0.0.1:5432/d2' \
  --schema acr_songcentric_test \
  --input-root acr-engine/data/songcentric_builder_smoke \
  --output-dir acr-engine/data/pgvector_eval/music20


Or:

acr-engine/scripts/start_songcentric_shortest_path.sh 'postgres://d2:d2pass@127.0.0.1:5432/d2'


Current fresh evidence:


song_count = 2
asset_count = 2
window_count = 5
matcher_fingerprint_count = 5
fallback_fingerprint_count = 0
semantic_runtime_available = true
semantic_runtime_missing = []
semantic_runtime_ready_count = 5
semantic_fallback_count = 0
import_counts = media_entity:9 / audio_object:22 / feature_fact:34 / set_membership:9


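2. Current status in one sentence


The 4-table song-centric schema now runs end to end on live PostgreSQL, covering the host chain "real directory -> slicing -> exact/semantic feature enrichment -> import -> feature_fact".


The most valuable next step is:


Without breaking this host chain, upgrade the semantic lane from the runtime-aware fallback to a real MERT / MuQ adapter.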


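3. Current stable conclusions


3.1 Default physical model

media_entity -> audio_object -> feature_fact -> set_membership


3.2 Default logical semantics

song -> asset -> window -> fingerprint / embedding


3.3 Key design trade-offs


The final attribution target currently only needs to return a stable song_id.

Multiple audio files are allowed under the same song.

window is kept because it is the smallest unit for slicing, evidence, offsets, and recall.

feature_fact holds both fingerprints and embeddings.

Phase-1 does not train or fine-tune first; it reuses open-source encoders directly.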


4. Which table holds windows / models / features


| Object | Table | Key fields |
| --- | --- | --- |
| song | media_entity | entity_type='song' |
| asset | audio_object | object_type='asset' |
| window | audio_object | object_type='window', parent_object_id=<asset_id> |
| model identity | feature_fact | model_name, model_version, feature_set_name |
| fingerprint payload | feature_fact | feature_type='fingerprint', fingerprint_value |
| embedding payload | feature_fact | feature_type='embedding', embedding_uri/vector_table_name, embedding_dim |
| set routing | set_membership | set_type, set_name, member_type, member_id |
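
To make the asset/window mapping concrete, here is a minimal Python sketch of how one asset and its windows could be expressed as audio_object rows. Only object_type and parent_object_id come from the mapping above; object_id and the entity_id link back to media_entity are assumed names used for illustration, not confirmed columns of acr_pg_schema_songcentric_v1.sql.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AudioObject:
    object_id: str                          # assumed primary-key name
    object_type: str                        # 'asset' or 'window'
    entity_id: str                          # assumed link to media_entity (the song)
    parent_object_id: Optional[str] = None  # set on windows, points at the asset


def audio_object_rows(song_id: str, asset_id: str, window_ids: List[str]) -> List[AudioObject]:
    """Return one asset row followed by one window row per window id."""
    rows = [AudioObject(asset_id, "asset", song_id)]
    for window_id in window_ids:
        rows.append(AudioObject(window_id, "window", song_id, parent_object_id=asset_id))
    return rows
```

Both assets and windows share one table here and are told apart only by object_type, which matches the mapping above where window rows carry parent_object_id=<asset_id>.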


5. Current flowchart

flowchart TD
    A[song / media_entity] --> B[asset / audio_object]
    B --> C[window / audio_object]
    C --> D1[fingerprint / feature_fact]
    C --> D2[embedding / feature_fact]
    A --> E[set_membership]
    B --> E
    C --> E
    D1 --> F[Recall and attribution to song_id]
    D2 --> F
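
The last step of the flowchart, recall followed by attribution to song_id, can be sketched in Python as below. This is an illustration under assumptions: parent_of and song_of_asset are hypothetical in-memory lookups standing in for audio_object.parent_object_id and the asset-to-song link, and summing window scores per song is an assumed aggregation rule, not one stated in these notes.

```python
def resolve_song_id(object_id, parent_of, song_of_asset):
    """Walk window -> asset through parent links, then map the asset to its song."""
    current = object_id
    seen = set()
    while parent_of.get(current) is not None:
        if current in seen:
            raise ValueError(f"cycle in parent links at {current!r}")
        seen.add(current)
        current = parent_of[current]
    return song_of_asset[current]


def attribute_hits(hits, parent_of, song_of_asset):
    """Sum per-window scores per song and return songs ranked by total score."""
    totals = {}
    for window_id, score in hits:
        song_id = resolve_song_id(window_id, parent_of, song_of_asset)
        totals[song_id] = totals.get(song_id, 0.0) + score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Because both fingerprint and embedding hits land on windows, the same resolution path serves the exact lane and the semantic lane.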


6. What has actually been verified so far


live PostgreSQL


DSN: postgres://d2:d2pass@127.0.0.1:5432/d2

schema: acr_songcentric_test


Verified paths


acr-engine/sql/acr_pg_schema_songcentric_v1.sql creates the tables on the live database.

bootstrap_songcentric_phase1_live.py can seed repeatedly.

import_songcentric_manifest_live.py imports song/asset/window/membership idempotently.

windows[].features[] in the manifest can be written directly to feature_fact.

Real directory -> manifest -> import has been verified.
Real directory -> fingerprint enrichment -> import has been verified.
The exact lane now prefers the in-repo ChromaprintMatcher.

The semantic lane is runtime-ready; the current host can enter the placeholder runtime branch.
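
The idempotent-import property noted above can be illustrated with a small Python sketch. It uses an in-memory SQLite database purely as a stand-in so the example runs without a server; the SQL actually used by import_songcentric_manifest_live.py is not shown in these notes, and the entity_id and title columns are assumed names.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in for the media_entity table (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE media_entity ("
        "entity_id TEXT PRIMARY KEY, entity_type TEXT NOT NULL, title TEXT)"
    )
    return conn


def import_songs(conn, songs):
    """Insert (entity_id, title) pairs; rows that already exist are left untouched."""
    conn.executemany(
        "INSERT INTO media_entity (entity_id, entity_type, title) "
        "VALUES (?, 'song', ?) ON CONFLICT(entity_id) DO NOTHING",
        songs,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM media_entity").fetchone()[0]
```

Running import_songs twice with the same input leaves the row count unchanged, which is the behavior a repeatable import needs.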


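7. The real blocker on the current host


torch / torchaudio / transformers can now be imported.
semantic_runtime_available = true at present.

The semantic lane is now connected to the real mert-v1-95m baseline.


This means the main blocker has moved on from "missing dependencies" to:


The runtime is ready and the real MERT baseline is connected; the next step is to connect MuQ.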
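
For reference, a dependency probe that yields the two runtime fields from the fresh evidence could look like the Python sketch below. It is an assumption about how such a check might be written, not the repository's actual code; check_semantic_runtime and its find_spec parameter are hypothetical names.

```python
import importlib.util

REQUIRED_MODULES = ("torch", "torchaudio", "transformers")


def check_semantic_runtime(required=REQUIRED_MODULES, find_spec=importlib.util.find_spec):
    """Report which required top-level modules cannot be located."""
    missing = [name for name in required if find_spec(name) is None]
    return {
        "semantic_runtime_available": not missing,
        "semantic_runtime_missing": missing,
    }
```

find_spec only locates a module without importing it, so the probe stays cheap even when torch is installed.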


For the MuQ target name, try the following first:


Hugging Face / code lead: OpenMuQ/MuQ-large-msd-iter

Official loading entry point: from muq import MuQ, then MuQ.from_pretrained("OpenMuQ/MuQ-large-msd-iter")

Existing Phase-1 task lead in the repo: muq + large-msd-iter
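
A guarded loader built on that entry point might look like the Python sketch below. The import and the from_pretrained call are the ones quoted above; load_muq and its find_spec parameter are hypothetical names, and returning None when the muq package is absent is an assumed convention chosen so the fallback branch can stay in place.

```python
import importlib.util

MUQ_MODEL_ID = "OpenMuQ/MuQ-large-msd-iter"


def load_muq(model_id=MUQ_MODEL_ID, find_spec=importlib.util.find_spec):
    """Return a loaded MuQ model, or None when the muq package is not installed."""
    if find_spec("muq") is None:
        return None
    from muq import MuQ  # imported lazily so hosts without muq still run

    return MuQ.from_pretrained(model_id)
```

Note that a successful call fetches the model weights, so it is best exercised once on the target host rather than in a smoke test.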


8. Which files to read first when resuming


README.md
start-here.md
postgresql-data-model.md
postgres_db_schema_samples.md
CHANGELOG.md


Key code:


acr-engine/sql/acr_pg_schema_songcentric_v1.sql
acr-engine/scripts/run_songcentric_directory_pipeline_live.py
acr-engine/scripts/build_songcentric_manifest_from_directory.py
acr-engine/scripts/enrich_songcentric_manifest_with_local_features.py
acr-engine/scripts/import_songcentric_manifest_live.py
acr-engine/scripts/start_songcentric_shortest_path.sh


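9. Priority order for next steps


Keep the current 4-table schema; do not roll it back.
Connect a real semantic adapter in enrich_songcentric_manifest_with_local_features.py.
Keep the fallback branch so the current host stays runnable.
Rerun the main pipeline runner and confirm fresh evidence for the semantic lane.


One-sentence handoff


Next time, do not restart from debating the overall design; run the song-centric runner directly. If exact works and semantic still falls back, keep filling in the real semantic adapter and its dependencies.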


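10. Where the real semantic adapter should be hooked in next

The most direct hook point is already clear:


Entry script: acr-engine/scripts/enrich_songcentric_manifest_with_local_features.py

Key function: build_semantic_feature(...)


Current actual state


The exact lane already prefers ChromaprintMatcher.

The semantic lane is not yet connected to a real MERT / MuQ adapter.

When the runtime is ready, it currently emits:


model_name = mert-v1-95m


The fallback branch is still kept: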


model_name = local_wavehash_embed
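
The two branches above can be sketched in Python as follows. This is a hypothetical shape, not the real build_semantic_feature(...), whose signature is not given in these notes: build_semantic_feature_sketch, the injected encoder callable, and the hash-based fallback_embedding are all assumptions made for illustration. Only the two model_name values come from the notes.

```python
import hashlib


def fallback_embedding(samples, dim=8):
    """Deterministic placeholder vector derived from a hash of the samples (assumed)."""
    digest = hashlib.sha256(repr(list(samples)).encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def build_semantic_feature_sketch(samples, runtime_ready, encoder=None):
    """Use the real encoder when the runtime is ready, else fall back."""
    vector = None
    model_name = "local_wavehash_embed"
    if runtime_ready and encoder is not None:
        try:
            vector = [float(x) for x in encoder(samples)]
            model_name = "mert-v1-95m"
        except Exception:
            vector = None  # any encoder failure drops to the fallback branch
            model_name = "local_wavehash_embed"
    if vector is None:
        vector = fallback_embedding(samples)
    return {
        "feature_type": "embedding",
        "model_name": model_name,
        "embedding_dim": len(vector),
        "vector": vector,
    }
```

Passing the encoder in as a callable keeps the fallback path free of any torch import, so the host remains runnable when the runtime is missing.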


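Fresh dependency check facts

This check recorded the host as still missing the packages below; section 7 and the section 1 evidence (semantic_runtime_missing = []) report them as importable now.


torch
torchaudio
transformers


Most direct implementation order for the next session


Install torch / torchaudio / transformers if any are still missing.

Connect a real MERT or MuQ adapter inside build_semantic_feature(...).
Keep the current local_wavehash_embed fallback; do not delete it.
Rerun: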


cd /workspace
/usr/local/miniconda3/bin/python acr-engine/scripts/run_songcentric_directory_pipeline_live.py \
  --dsn 'postgres://d2:d2pass@127.0.0.1:5432/d2' \
  --schema acr_songcentric_test \
  --input-root acr-engine/data/songcentric_builder_smoke \
  --output-dir acr-engine/data/pgvector_eval/music20


Expected change in the fresh metrics


semantic_runtime_available = true
semantic_runtime_ready_count > 0

semantic_fallback_count drops clearly or reaches zero
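
These three expectations can be turned into a quick check on two runs of the runner. The Python sketch below assumes the metrics are available as plain dicts keyed by the names shown in the fresh evidence; semantic_lane_improved is a hypothetical helper, and treating any decrease in semantic_fallback_count as a clear drop is a simplifying assumption.

```python
def semantic_lane_improved(before, after):
    """Return True when the later run meets the three expectations above."""
    runtime_ok = after.get("semantic_runtime_available") is True
    ready_ok = after.get("semantic_runtime_ready_count", 0) > 0
    fallback_before = before.get("semantic_fallback_count", 0)
    fallback_after = after.get("semantic_fallback_count", 0)
    fallback_ok = fallback_after == 0 or fallback_after < fallback_before
    return runtime_ok and ready_ok and fallback_ok
```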