Commit 020702cc 020702cc4cb314fdd30174c1a3e696584d196306 by cnb.bofCdSsphPA

Promote the one-command song-centric runner as the default handoff path

Constraint: Reduce resume friction for future sessions by making the current live-validated runner the first-class entrypoint in docs and artifacts.
Rejected: Keep the active song-centric workflow scattered across multiple lower-level commands in the handoff docs | It slows recovery and increases cognitive overhead.
Confidence: high
Scope-risk: narrow
Directive: For current development, start with run_songcentric_directory_pipeline_live.py before dropping to the lower-level builder/enricher/importer commands.
Tested: /usr/local/miniconda3/bin/python acr-engine/scripts/run_songcentric_directory_pipeline_live.py --dsn postgres://d2:d2pass@127.0.0.1:5432/d2 --schema acr_songcentric_test --input-root acr-engine/data/songcentric_builder_smoke --output-dir acr-engine/data/pgvector_eval/music20; git diff --check; /usr/local/miniconda3/bin/python scripts/check_markdown_links.py --root docs returned OK for 11 active markdown files
Not-tested: alternate input roots beyond the current smoke directory
1 parent 3b4b3684
## 2026-06-04
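- Promote `run_songcentric_directory_pipeline_live.py` to the current default mainline entrypoint, and sync the fresh runner results into `docs/README.md`, `docs/start-here.md`, and `docs/session-handoff.md` to lower the recovery cost of the next session.
- Add `acr-engine/scripts/run_songcentric_directory_pipeline_live.py`, which collapses "real directory -> manifest -> feature enrichment -> live PostgreSQL import" into a single repeatable runner and prints a summary of the exact/semantic backend selection and the import counts.
- Upgrade `enrich_songcentric_manifest_with_local_features.py` to runtime-aware semantic adapter selection: because `torch`, `torchaudio`, and `transformers` are missing on the current host, the semantic lane explicitly writes the `local_wavehash_embed` fallback and records the missing dependencies in the report/metadata.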
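The runtime-aware selection described in the last entry can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `select_semantic_adapter`, the `semantic_adapter` key, and the preferred adapter name `model_embed` are assumptions; only the dependency names, the `local_wavehash_embed` fallback, and the two `semantic_runtime_*` field names come from this commit.

```python
import importlib.util

# Dependency names and the fallback adapter name are taken from the changelog entry.
SEMANTIC_DEPS = ("torch", "torchaudio", "transformers")
FALLBACK_ADAPTER = "local_wavehash_embed"


def select_semantic_adapter(preferred="model_embed", deps=SEMANTIC_DEPS, probe=None):
    """Pick a semantic adapter and report which dependencies are missing.

    `preferred` is a hypothetical adapter name; `probe` is an injectable
    availability check so the logic can be exercised without the real packages.
    """
    if probe is None:
        # find_spec returns None for a top-level module that cannot be found.
        probe = lambda name: importlib.util.find_spec(name) is not None
    missing = [name for name in deps if not probe(name)]
    return {
        "semantic_runtime_available": not missing,
        "semantic_runtime_missing": missing,
        # Fall back as soon as any dependency is absent.
        "semantic_adapter": FALLBACK_ADAPTER if missing else preferred,
    }
```

Returning the missing list alongside the chosen adapter is what lets the fallback be written into the report/metadata instead of happening silently.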
......
......@@ -6,6 +6,19 @@
## 0. What new teammates should do first
If you are continuing the song-centric mainline, run this first:
```bash
cd /workspace
/usr/local/miniconda3/bin/python acr-engine/scripts/run_songcentric_directory_pipeline_live.py \
--dsn 'postgres://d2:d2pass@127.0.0.1:5432/d2' \
--schema acr_songcentric_test \
--input-root acr-engine/data/songcentric_builder_smoke \
--output-dir acr-engine/data/pgvector_eval/music20
```
If you need to regression-test the old planner/worker contract, then run:
```bash
cd /workspace/acr-engine
/usr/local/miniconda3/bin/python scripts/run_planner_validation_commands_live.py \
......
......@@ -11,7 +11,27 @@
## 0. What to do first on the next startup
Run first:
If the next startup continues the current mainline (**song-centric real directory -> feature -> PostgreSQL**), run first:
```bash
cd /workspace
/usr/local/miniconda3/bin/python acr-engine/scripts/run_songcentric_directory_pipeline_live.py \
--dsn 'postgres://d2:d2pass@127.0.0.1:5432/d2' \
--schema acr_songcentric_test \
--input-root acr-engine/data/songcentric_builder_smoke \
--output-dir acr-engine/data/pgvector_eval/music20
```
Current fresh evidence:
- `song_count = 2`
- `window_count = 5`
- `matcher_fingerprint_count = 5`
- `fallback_fingerprint_count = 0`
- `semantic_runtime_available = false`
- `semantic_runtime_missing = [torch, torchaudio, transformers]`
- `import_counts = media_entity:9 / audio_object:22 / feature_fact:24 / set_membership:9`
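One way to compare a later run against the evidence listed above is sketched below. It assumes the runner's summary is available as a dictionary with the same field names; that shape, and the helper names `parse_import_counts` and `diff_evidence`, are hypothetical and not taken from the script.

```python
def parse_import_counts(text):
    """Parse 'media_entity:9 / audio_object:22' into {'media_entity': 9, ...}."""
    counts = {}
    for part in text.split("/"):
        name, _, value = part.strip().partition(":")
        counts[name.strip()] = int(value)
    return counts


# Values copied from the evidence list in this handoff document.
EXPECTED = {
    "song_count": 2,
    "window_count": 5,
    "matcher_fingerprint_count": 5,
    "fallback_fingerprint_count": 0,
    "semantic_runtime_available": False,
    "import_counts": parse_import_counts(
        "media_entity:9 / audio_object:22 / feature_fact:24 / set_membership:9"
    ),
}


def diff_evidence(summary, expected=EXPECTED):
    """Return {field: (expected, actual)} for every field that differs."""
    return {
        key: (want, summary.get(key))
        for key, want in expected.items()
        if summary.get(key) != want
    }
```

An empty result means the run reproduced the recorded counts; any entry names the field that drifted, which is useful when trying input roots other than the smoke directory.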
If you only need to regression-test the historical Phase-1 planner/worker contract, then run:
```bash
cd /workspace/acr-engine
......
......@@ -6,6 +6,27 @@
## 1. Run this command first
If your current goal is to validate the **song-centric real directory -> feature -> PostgreSQL** main chain, run this first:
```bash
cd /workspace
/usr/local/miniconda3/bin/python acr-engine/scripts/run_songcentric_directory_pipeline_live.py \
--dsn 'postgres://d2:d2pass@127.0.0.1:5432/d2' \
--schema acr_songcentric_test \
--input-root acr-engine/data/songcentric_builder_smoke \
--output-dir acr-engine/data/pgvector_eval/music20
```
Current fresh evidence:
- `song_count = 2`
- `window_count = 5`
- `matcher_fingerprint_count = 5`
- `fallback_fingerprint_count = 0`
- `semantic_runtime_available = false`
- `import_counts.feature_fact = 24`
If your current goal is to validate the old Phase-1 planner/worker contract, then run the command below:
```bash
cd /workspace/acr-engine
/usr/local/miniconda3/bin/python scripts/run_planner_validation_commands_live.py \
......