1. 02 Jun, 2026 · 40 commits
    • Constraint: Once the large FMA archive finishes, future sessions should not need to manually stitch extraction and readiness checks together
      Rejected: Leave post-download steps as manual shell sequences | Increases delay and error risk at the most valuable transition point
      Confidence: high
      Scope-risk: narrow
      Directive: Keep fma_postdownload_ready.py as the canonical first command after archive completion before attempting real-data smoke runs
      Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/fma_postdownload_ready.py; /usr/local/miniconda3/bin/python acr-engine/scripts/fma_postdownload_ready.py
      Not-tested: Successful extract and readiness on the full archive remain pending completion of the download
      cnb.bofCdSsphPA authored
    • Constraint: Schema and manifest-export templates are useful, but practical adoption still needs an explicit handoff into database load order and SQL shapes
      Rejected: Stop at export JSON only | Leaves later sessions to redesign the bulk-ingest bridge from scratch
      Confidence: high
      Scope-risk: narrow
      Directive: Keep bulk-load templates declarative until a real database target is available, then add a live loader without changing manifest semantics
      Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/pgvector_bulk_load_template.py; /usr/local/miniconda3/bin/python acr-engine/scripts/pgvector_bulk_load_template.py --input acr-engine/reports/pgvector_manifest_export_test.json --output acr-engine/reports/pgvector_bulk_load_plan_test.json
      Not-tested: Live PostgreSQL execution remains pending a database environment
      cnb.bofCdSsphPA authored
    • Constraint: The user needs concrete downstream data handling guidance now, and future vector retrieval work should not start from abstract docs alone
      Rejected: Leave pgvector support at prose-only guidance | Delays integration by forcing later sessions to reinvent schema and export bridges
      Confidence: high
      Scope-risk: narrow
      Directive: Keep schema/export templates aligned with actual manifest semantics before adding live database loaders
      Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/export_manifest_to_pgvector_json.py; /usr/local/miniconda3/bin/python acr-engine/scripts/export_manifest_to_pgvector_json.py --data acr-engine/data/synthetic_v2 --split test --source-dataset synthetic_v2 --output acr-engine/reports/pgvector_manifest_export_test.json
      Not-tested: Live PostgreSQL/pgvector ingestion remains pending a real database target
      cnb.bofCdSsphPA authored
    • Constraint: The user needs detailed data-format guidance now, while the real FMA archive transfer still requires durable hands-off supervision across long sessions
      Rejected: Treat documentation and download-watch work as separate later tasks | Would leave either user guidance or transfer resilience lagging behind active development
      Confidence: high
      Scope-risk: narrow
      Directive: Keep the new training-data/pgvector guide aligned with actual manifest fields and use watch_fma_download.py as the first-line long-transfer watchdog
      Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/watch_fma_download.py; /usr/local/miniconda3/bin/python acr-engine/scripts/watch_fma_download.py --cycles 2 --interval 2; /usr/local/miniconda3/bin/python acr-engine/scripts/prepare_fma_archive.py inspect
      Not-tested: Full archive completion, extraction, and real-data smoke remain pending
      cnb.bofCdSsphPA authored
    • Constraint: Long FMA archive downloads cannot rely on fragile foreground execution if Ralph-style work must continue across sessions
      Rejected: Keep manually reissuing foreground download commands after stalls | Increases interruption risk and weakens resumability evidence
      Confidence: high
      Scope-risk: narrow
      Directive: Prefer prepare_fma_archive.py bg-download for future large archive recovery so PID and log evidence remain standardized
      Tested: /usr/local/miniconda3/bin/python acr-engine/scripts/prepare_fma_archive.py bg-download; /usr/local/miniconda3/bin/python acr-engine/scripts/prepare_fma_archive.py inspect; tail -n 40 /tmp/fma_modelscope_download.log
      Not-tested: Full archive completion, extraction, and real-data smoke remain pending
      cnb.bofCdSsphPA authored
    • Constraint: Multi-session continuation gets brittle when large real-data downloads require manual byte math to estimate progress
      Rejected: Leave inspect output as raw archive size only | Forces every future session to recalculate completion state by hand
      Confidence: high
      Scope-risk: narrow
      Directive: Keep progress fields stable so handoff tooling and humans can rely on them during long archive transfers
      Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/prepare_fma_archive.py; /usr/local/miniconda3/bin/python acr-engine/scripts/prepare_fma_archive.py inspect
      Not-tested: Completion of the full archive and downstream extraction remain pending
      cnb.bofCdSsphPA authored
    • Constraint: The user supplied a verified archive URL that is a better current source of truth than the previously tested mirror path
      Rejected: Keep the older archive URL as the default control surface | Would ignore fresher user evidence and split operational guidance across sources
      Confidence: high
      Scope-risk: narrow
      Directive: Treat the ModelScope FMA archive URL as the primary default until a newer verified source supersedes it
      Tested: curl -I -L --max-time 60 https://modelscope.cn/datasets/pengzhendong/fma/resolve/master/fma_small.zip; curl -L --range 0-1023 --max-time 60 -o /tmp/fma_modelscope_probe.bin https://modelscope.cn/datasets/pengzhendong/fma/resolve/master/fma_small.zip; /usr/local/miniconda3/bin/python acr-engine/scripts/prepare_fma_archive.py inspect
      Not-tested: Full archive completion, extraction, and downstream real-data smoke remain pending
      cnb.bofCdSsphPA authored
    • Constraint: A service intended for industrialization needs a real process-level smoke test, not only direct function imports
      Rejected: Rely on unit-style handler calls alone | Misses uvicorn startup and actual HTTP surface regressions
      Confidence: high
      Scope-risk: narrow
      Directive: Keep service_smoke.py lightweight and dependency-free so it remains the fastest operational gate before broader API expansion
      Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/service_smoke.py; /usr/local/miniconda3/bin/python acr-engine/scripts/service_smoke.py
      Not-tested: /recognize and /index/build over HTTP remain pending dedicated API smoke inputs
      cnb.bofCdSsphPA authored
    • Constraint: Industrializing the service path requires visibility into model/index availability and repeated-load behavior before adding heavier production features
      Rejected: Keep stateless per-request loading until later | Hides readiness problems and wastes time on repeated engine initialization
      Confidence: high
      Scope-risk: narrow
      Directive: Preserve /ready and /cache as low-cost operational probes even if the serving stack evolves behind them
      Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/src/service/app.py; /usr/local/miniconda3/bin/python /tmp/test_service_readiness.py; /usr/local/miniconda3/bin/python /tmp/test_service_cache.py
      Not-tested: Live FastAPI HTTP serving and concurrent request behavior remain pending
      cnb.bofCdSsphPA authored
    • Constraint: The verified FMA archive is multi-gigabyte and downloads slowly, so the workflow must remain inspectable and resumable before extraction can happen
      Rejected: Depend on ad hoc curl and unzip commands only | Makes long-running handoff and recovery brittle during Ralph-style continuous execution
      Confidence: high
      Scope-risk: narrow
      Directive: Keep official FMA archive acquisition centered on prepare_fma_archive.py so future sessions share one resumable control surface
      Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/prepare_fma_archive.py; /usr/local/miniconda3/bin/python acr-engine/scripts/prepare_fma_archive.py inspect; unzip -v | head -n 2
      Not-tested: Archive extraction and real-data smoke remain pending completion of the full fma_small.zip download
      cnb.bofCdSsphPA authored
    • Constraint: Real-data progress was blocked until we could prove an upstream archive path that still works today
      Rejected: Continue iterating on historical per-track URLs | Those paths already proved unstable via 403 and 404 evidence
      Confidence: high
      Scope-risk: narrow
      Directive: Prefer the verified fma_small.zip archive route over legacy page or single-track scraping paths unless upstream changes again
      Tested: curl -I -L --max-time 60 https://os.unil.cloud.switch.ch/fma/fma_small.zip; curl -L --range 0-1023 --max-time 60 -o /tmp/fma_small_probe.bin https://os.unil.cloud.switch.ch/fma/fma_small.zip
      Not-tested: Full 7.68 GB archive download, extraction, and smoke execution remain pending
      cnb.bofCdSsphPA authored
    • Constraint: Real-data progress requires proving whether failures come from our environment or from changed upstream access paths
      Rejected: Keep treating the fetch blocker as a missing-tool problem | Would misdirect future debugging after yt-dlp module support was verified
      Confidence: high
      Scope-risk: narrow
      Directive: Do not retry historical FMA page URLs again unless a fresh source confirms their return; pivot to official archives or stable mirrors instead
      Tested: which yt-dlp || true; /usr/local/miniconda3/bin/python -m yt_dlp --version; /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/fetch_fma_subset.py; /usr/local/miniconda3/bin/python acr-engine/scripts/fetch_fma_subset.py --report acr-engine/reports/fma_fetch_subset_report.json
      Not-tested: Successful real FMA download still pending a valid upstream archive or mirror URL
      cnb.bofCdSsphPA authored
    • Constraint: Continuous dataset landing work needs concrete failed-path evidence so future sessions do not restart from outdated assumptions
      Rejected: Omit the failed download automation because it did not complete | Loses reproducible evidence about the current 403 and missing-tool barriers
      Confidence: high
      Scope-risk: narrow
      Directive: Replace this bounded fetch path only after verifying a stable official archive or mirror-based download route
      Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/fetch_fma_subset.py; /usr/local/miniconda3/bin/python acr-engine/scripts/fetch_fma_subset.py --report acr-engine/reports/fma_fetch_subset_report.json
      Not-tested: Successful real FMA audio download remains blocked by current upstream/tooling availability
      cnb.bofCdSsphPA authored
    • Constraint: The user wants real datasets added locally and potentially pushed, which would make ordinary git history fragile without LFS guardrails
      Rejected: Download first and retrofit tracking later | Risks oversized commits and inconsistent reproducibility rules
      Confidence: high
      Scope-risk: narrow
      Directive: Route all future raw corpus archives and audio under acr-engine/data/raw through LFS unless a smaller manifest-only alternative is explicitly chosen
      Tested: git lfs version; git check-attr filter -- acr-engine/data/raw/fma_small_audio/example.wav; git check-attr filter -- acr-engine/data/raw/archive.zip
      Not-tested: Actual large-file add/push against remote LFS storage remains pending until real dataset files are downloaded
      cnb.bofCdSsphPA authored
    • Constraint: Real-data validation now depends on user-requested local corpus drop zones that may exist before they contain any audio
      Rejected: Let smoke-local fail deep inside training | Produces slower and less actionable feedback for continuous sessions
      Confidence: high
      Scope-risk: narrow
      Directive: Keep readiness thresholds aligned with the minimum viable query split assumptions before expanding real-data automation
      Tested: /usr/local/miniconda3/bin/python -m py_compile src/data/external_adapters.py scripts/status_snapshot.py; /usr/local/miniconda3/bin/python src/data/external_adapters.py check-local-ready fma data/raw/fma_small_audio --eval-ratio 0.2 --query-duration 8.0; /usr/local/miniconda3/bin/python src/data/external_adapters.py check-local-ready mtg_jamendo data/raw/mtg_jamendo_audio --eval-ratio 0.2 --query-duration 8.0; /usr/local/miniconda3/bin/python scripts/status_snapshot.py --output .omx/latest_status_snapshot.json
      Not-tested: Full smoke-local on real FMA or MTG-Jamendo remains blocked until audio is actually downloaded
      cnb.bofCdSsphPA authored
    • Constraint: Ongoing Ralph-style handoff requires new sessions to distinguish finished capability from smoke-only scaffolding quickly
      Rejected: Leave capability status implicit in scattered docs | Increases onboarding ambiguity and status misreads
      Confidence: high
      Scope-risk: narrow
      Directive: Update this map whenever a smoke path becomes real-data validated or a regression invalidates a claimed capability
      Tested: Verified docs/current-capability-map.md exists and is linked from docs/README.md and docs/session-handoff.md
      Not-tested: Semantic accuracy against future real-dataset runs remains pending
      cnb.bofCdSsphPA authored
    • Constraint: Future sessions benefit from a saved machine-readable snapshot, not just on-demand script output
      Rejected: Keep snapshot stdout-only | Makes handoff less durable and harder to automate across sessions
      Confidence: high
      Scope-risk: narrow
      Directive: Refresh .omx/latest_status_snapshot.json whenever the default docs, smoke paths, or next-step commands materially change
      Tested: /usr/local/miniconda3/bin/python scripts/status_snapshot.py --output .omx/latest_status_snapshot.json; JSON parse check for latest_commit and next_commands
      Not-tested: External automation consuming the saved snapshot over multiple sessions
      cnb.bofCdSsphPA authored
    • Constraint: Future sessions need a quick machine-readable summary of the verified repo state and next commands
      Rejected: Depend on manual reconstruction from docs and git history alone | Slower and more error-prone during handoff
      Confidence: high
      Scope-risk: narrow
      Directive: Keep the snapshot script aligned with the real default docs, drop zones, smoke outputs, and next-step commands
      Tested: /usr/local/miniconda3/bin/python scripts/status_snapshot.py
      Not-tested: Consumption of the snapshot by external automation beyond manual review
      cnb.bofCdSsphPA authored
    • Constraint: New sessions need a minimal startup checklist so they can verify repo health and resume development quickly
      Rejected: Keep startup knowledge implicit in long docs only | Increases ramp-up time and the chance of missing key checks
      Confidence: high
      Scope-risk: narrow
      Directive: Update this checklist whenever the default startup workflow or open-dataset commands materially change
      Tested: existence checks for acr-engine/FIRST_RUN_CHECKLIST.md, docs/README.md, docs/session-handoff.md, plus docs link-presence checks
      Not-tested: Human walkthrough of the full checklist from a fresh shell
      cnb.bofCdSsphPA authored
    • Constraint: New sessions need a fast, durable understanding of the project state, open-dataset workflow, verified evidence, and next steps
      Rejected: Rely on scattered docs and git history alone | Too slow for session handoff and easy to miss critical workflow context
      Confidence: high
      Scope-risk: narrow
      Directive: Keep this handoff doc updated whenever a major workflow milestone or verified capability changes
      Tested: existence checks for docs/session-handoff.md and docs/README.md, plus docs index link presence
      Not-tested: Manual human review across multiple markdown renderers
      cnb.bofCdSsphPA authored
    • Constraint: Replacing the synthetic stand-in with real FMA or MTG-Jamendo data should not require users to infer directory structure
      Rejected: Leave only generic workflow text | Still forces users to guess where local audio should live before smoke runs
      Confidence: high
      Scope-risk: narrow
      Directive: Keep future real-corpus onboarding anchored to data/raw drop zones and smoke-local commands
      Tested: filesystem existence checks for acr-engine/data/raw/fma_small_audio, acr-engine/data/raw/mtg_jamendo_audio, acr-engine/data/raw/README.md, docs/README.md, docs/open-dataset-workflow.md, acr-engine/data/external_ingested/README.md
      Not-tested: Real downloaded audio placed into the new drop zones
      cnb.bofCdSsphPA authored
    • Constraint: Real FMA or MTG-Jamendo onboarding should require only an input directory change, not a long manual command chain
      Rejected: Keep the smoke steps as separate manual commands | Slows repeated validation and increases operator error risk
      Confidence: high
      Scope-risk: moderate
      Directive: Use smoke-local as the default first-pass validation path for every new local open-music corpus
      Tested: /usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/synthetic_v2/songs --output-root data/external_smoke --eval-ratio 0.2 --query-duration 5.0 --train-epochs 1 --batch-size 2; /usr/local/miniconda3/bin/python -m py_compile src/data/external_adapters.py src/data/manifest_tools.py train.py run_demo.py evaluate.py scripts/generate_artifacts.py
      Not-tested: Real downloaded FMA or MTG-Jamendo directories on larger-scale smoke runs
      cnb.bofCdSsphPA authored
    • Constraint: Open-dataset workflow needed the same reporting/release outputs as the synthetic baseline to be operationally useful
      Rejected: Treat open-data smoke as a one-off test only | Leaves no reusable benchmark or release documentation trail
      Confidence: high
      Scope-risk: narrow
      Directive: Every future real-dataset smoke run should emit eval JSON plus artifact bundle in the same directory
      Tested: /usr/local/miniconda3/bin/python scripts/generate_artifacts.py --eval-json reports/open-smoke-fixed/fma/eval.json --config-json reports/open-smoke-fixed/fma/config.json --output-dir reports/open-smoke-fixed/fma --model-version open-smoke-fixed --data-version synthetic_as_open_fixed_fma
      Not-tested: Artifact generation on a larger real downloaded corpus with multiple hard-case buckets
      cnb.bofCdSsphPA authored
    • Constraint: Open-dataset support was not complete until imported corpora could train, build indexes, and produce eval outputs without manual path surgery
      Rejected: Stop at train.py dry-run | Does not prove the retrieval/evaluation half of the workflow actually works
      Confidence: high
      Scope-risk: moderate
      Directive: Keep future external dataset layouts self-contained and manifests-root aware across training, indexing, and evaluation paths
      Tested: /usr/local/miniconda3/bin/python train.py --data data/external_ingested/synthetic_as_open_fixed/fma/manifests --output data/models_open_smoke_fixed --device cpu --epochs 1 --batch-size 2; /usr/local/miniconda3/bin/python run_demo.py build-index --data data/external_ingested/synthetic_as_open_fixed/fma/manifests --model data/models_open_smoke_fixed/best_model.pt --output data/index_open_smoke_fixed --device cpu; /usr/local/miniconda3/bin/python evaluate.py --data data/external_ingested/synthetic_as_open_fixed/fma/manifests --model data/models_open_smoke_fixed/best_model.pt --index-prefix data/index_open_smoke_fixed/reference --split test --device cpu --fast-eval --output-json reports/open-smoke-fixed/fma/eval.json; /usr/local/miniconda3/bin/python -m py_compile evaluate.py run_demo.py src/engines/ecapa_embedder.py src/engines/chromaprint_matcher.py src/data/dataset.py src/data/manifest_tools.py src/data/external_adapters.py train.py
      Not-tested: Real downloaded FMA or MTG-Jamendo corpora at larger scale
      cnb.bofCdSsphPA authored
    • Constraint: Open dataset onboarding was incomplete until generated manifests could enter train.py without manual path fixes
      Rejected: Keep manifests as ingestion-only artifacts | Fails the actual training handoff and leaves the workflow broken
      Confidence: high
      Scope-risk: moderate
      Directive: Preserve the self-contained output layout (audio plus manifests) for all future external dataset imports
      Tested: /usr/local/miniconda3/bin/python src/data/external_adapters.py prepare-local fma data/synthetic_v2/songs --output-root data/external_ingested/synthetic_as_open_fixed --eval-ratio 0.2 --query-duration 5.0; /usr/local/miniconda3/bin/python src/data/external_adapters.py validate-local fma data/external_ingested/synthetic_as_open_fixed/fma/manifests; /usr/local/miniconda3/bin/python train.py --data data/external_ingested/synthetic_as_open_fixed/fma/manifests --output data/models_open_smoke_fixed --device cpu --epochs 1 --batch-size 2 --dry-run; /usr/local/miniconda3/bin/python -m py_compile src/data/dataset.py train.py src/data/manifest_tools.py src/data/external_adapters.py
      Not-tested: Full multi-epoch training and index/eval loop on a real downloaded FMA or MTG-Jamendo corpus
      cnb.bofCdSsphPA authored
    • Constraint: Open-dataset onboarding needed one short executable path instead of scattered instructions across many docs
      Rejected: Leave ingestion knowledge split across multiple pages | Raises setup friction before real FMA or MTG-Jamendo training
      Confidence: high
      Scope-risk: narrow
      Directive: Use the single-page workflow as the default operator path before adding more open-dataset sources
      Tested: /usr/local/miniconda3/bin/python src/data/external_adapters.py inspect-local fma data/synthetic_v2/songs --eval-ratio 0.2 --query-duration 5.0; /usr/local/miniconda3/bin/python src/data/external_adapters.py prepare-local fma data/synthetic_v2/songs --output-root data/external_ingested/synthetic_as_open --eval-ratio 0.2 --query-duration 5.0; /usr/local/miniconda3/bin/python src/data/external_adapters.py validate-local fma data/external_ingested/synthetic_as_open/fma/manifests
      Not-tested: Real FMA or MTG-Jamendo local download directories
      cnb.bofCdSsphPA authored
    • Constraint: Readers need fewer entry documents and clickable relative links before scaling open-dataset usage
      Rejected: Keep expanding flat documentation pages | Increases navigation cost and hides the main execution path
      Confidence: high
      Scope-risk: moderate
      Directive: Route future dataset operations through inspect-local/inspect-batch/prepare-local/validate-local and keep docs grouped by role
      Tested: /usr/local/miniconda3/bin/python -m py_compile src/data/manifest_tools.py src/data/external_adapters.py; /usr/local/miniconda3/bin/python src/data/manifest_tools.py validate-splits data/external_ingested/demo_via_adapter/fma/manifests; /usr/local/miniconda3/bin/python src/data/external_adapters.py validate-local fma data/external_ingested/demo_via_adapter/fma/manifests; python3 targeted-doc-link scan over docs/README.md docs/dataset-spec.md docs/dataset-sources-and-licensing.md docs/industrialization-roadmap.md docs/service-api.md docs/industrial-benchmark-spec.md acr-engine/data/external_ingested/README.md
      Not-tested: Real browser/rendered markdown click-through behavior across every client
      cnb.bofCdSsphPA authored
    • Constraint: Personal-use dataset preparation needs fast comparison across several local open-music corpora before ingestion
      Rejected: Inspect each dataset directory manually one by one | Slows repeated train/eval setup and comparison
      Confidence: high
      Scope-risk: narrow
      Directive: Use inspect-batch on real FMA and MTG-Jamendo folders before selecting training and held-out evaluation corpora
      Tested: /usr/local/miniconda3/bin/python -m py_compile src/data/external_adapters.py src/data/manifest_tools.py; /usr/local/miniconda3/bin/python src/data/external_adapters.py inspect-batch fma=tmp/open_music_demo_fma mtg_jamendo=tmp/open_music_demo_jamendo --eval-ratio 0.5 --query-duration 5.0
      Not-tested: Real upstream corpus inventory on downloaded full-size open datasets
      cnb.bofCdSsphPA authored
    • Constraint: Personal-use dataset setup needs quick scale visibility before generating train/eval manifests
      Rejected: Generate splits blindly | Hides whether a local corpus is large enough for meaningful train/test separation
      Confidence: high
      Scope-risk: narrow
      Directive: Run inspect-local on real FMA or MTG-Jamendo folders before prepare-local and training
      Tested: /usr/local/miniconda3/bin/python -m py_compile src/data/manifest_tools.py src/data/external_adapters.py; /usr/local/miniconda3/bin/python src/data/manifest_tools.py inspect-audio-dir tmp/open_music_demo --query-duration 5.0 --eval-ratio 0.5; /usr/local/miniconda3/bin/python src/data/external_adapters.py inspect-local fma tmp/open_music_demo --eval-ratio 0.5 --query-duration 5.0
      Not-tested: Real large external corpus inventory on downloaded FMA or MTG-Jamendo directories
      cnb.bofCdSsphPA authored
    • Constraint: Personal-use experimentation needs a single entrypoint from local open-audio directories to train/eval manifests
      Rejected: Separate manual manifest generation per dataset | Too error-prone and slows iterative training/evaluation
      Confidence: high
      Scope-risk: narrow
      Directive: Point real FMA or MTG-Jamendo local download folders at prepare-local before expanding training runs
      Tested: /usr/local/miniconda3/bin/python -m py_compile src/data/external_adapters.py src/data/manifest_tools.py; /usr/local/miniconda3/bin/python src/data/external_adapters.py prepare-local fma tmp/open_music_demo --output-root data/external_ingested/demo_via_adapter --eval-ratio 0.5 --query-duration 5.0
      Not-tested: Full upstream corpus import and large-scale training
      cnb.bofCdSsphPA authored
    • Constraint: Personal-use workflow needs real train/eval manifests rather than bootstrap-only placeholders
      Rejected: Keep external datasets as catalog skeletons only | Does not satisfy training/evaluation reuse requirement
      Confidence: high
      Scope-risk: narrow
      Directive: Wire real FMA or MTG-Jamendo local download directories into this ingestion path before larger-scale training
      Tested: /usr/local/miniconda3/bin/python -m py_compile src/data/manifest_tools.py; /usr/local/miniconda3/bin/python src/data/manifest_tools.py audio-dir-to-splits tmp/open_music_demo data/external_ingested/demo_fma_like --source-dataset demo_fma_like --eval-ratio 0.5 --query-duration 5.0
      Not-tested: Full download/import of upstream FMA or MTG-Jamendo corpora
      cnb.bofCdSsphPA authored
    • Constraint: Changing defaults requires fresh, like-for-like evidence on stable v6 assets
      Rejected: More training-weight tuning | v7 and v8 regressed hard-case and overall accuracy
      Confidence: high
      Scope-risk: narrow
      Directive: Use open datasets as separate train/eval assets and tune fusion on held-out eval manifests before retraining
      Tested: /usr/local/miniconda3/bin/python -m py_compile evaluate.py; /usr/local/miniconda3/bin/python evaluate.py --data data/synthetic_v2 --model data/models_v6/best_model.pt --index-prefix data/index_v6/reference --split test --device cpu --fast-eval; /usr/local/miniconda3/bin/python evaluate.py --data data/synthetic_v2 --model data/models_v6/best_model.pt --index-prefix data/index_v6/reference --split test --device cpu --fast-eval --chroma-weight 0.2 --ecapa-weight 0.55 --melody-weight 0.25 --output-json reports/smoke-v6/synthetic_v2/eval-fusion-tuned.json
      Not-tested: Full melody-enabled sweep across multiple weight grids and real external datasets
      cnb.bofCdSsphPA authored
    • Constraint: Optimization must preserve a runnable pipeline and record stage evidence before continuing
      Rejected: More naive oversampling | Regressed overall and hard-case accuracy in smoke-v4
      Confidence: medium
      Scope-risk: moderate
      Directive: Treat confused and humming_like as separate optimization lanes in future stages
      Tested: /usr/local/miniconda3/bin/python train.py --data data/synthetic_v2 --output data/models_v6 --device cpu --epochs 1 --batch-size 6 --dry-run; /usr/local/miniconda3/bin/python -m py_compile train.py src/models/losses.py src/data/dataset.py; /usr/local/miniconda3/bin/python train.py --data data/synthetic_v2 --output data/models_v6 --device cpu --epochs 2 --batch-size 6; /usr/local/miniconda3/bin/python run_demo.py build-index --data data/synthetic_v2 --model data/models_v6/best_model.pt --output data/index_v6 --device cpu; /usr/local/miniconda3/bin/python evaluate.py --data data/synthetic_v2 --model data/models_v6/best_model.pt --index-prefix data/index_v6/reference --split test --device cpu --fast-eval --output-json reports/smoke-v6/synthetic_v2/eval.json; /usr/local/miniconda3/bin/python scripts/generate_artifacts.py --eval-json reports/smoke-v6/synthetic_v2/eval.json --config-json reports/smoke-v6/synthetic_v2/config.json --output-dir reports/smoke-v6/synthetic_v2 --model-version smoke-v6 --data-version synthetic_v2
      Not-tested: Real external dataset training run and GPU-scale convergence
      cnb.bofCdSsphPA authored
    • Broaden external dataset bootstrap support and replace naive hard-case oversampling with a more targeted weighting signal that measurably helps humming-like queries while preserving the release/eval workflow.
      
      Constraint: Hard-case optimization must be evidence-driven and preserve a record of mixed outcomes across iterations
      Rejected: Reuse naive oversampling after regression | It already showed worse overall behavior with no hard-case gain
      Confidence: medium
      Scope-risk: moderate
      Directive: Next iteration should target confused-case negatives explicitly; do not assume humming gains transfer to confusion robustness
      Tested: bootstrap generation for MTG-Jamendo and ModelScope placeholders; 2-epoch CPU training for models_v5; index_v5 build; fast eval JSON generation for smoke-v5
      Not-tested: real audio ingestion for the new datasets; full melody-aware slow evaluation on models_v5
      cnb.bofCdSsphPA authored
    • Extend the data ingress path with bootstrap manifests for real datasets and capture an unsuccessful hard-case oversampling experiment so future iterations can avoid repeating the same weak strategy.
      
      Constraint: Continuous optimization requires preserving negative results, not just successful ones
      Rejected: Drop the oversampling attempt without record | Would lose evidence and encourage redoing the same low-yield change
      Confidence: high
      Scope-risk: moderate
      Directive: Next hard-case work should focus on melody-aware supervision and harder negatives instead of naive sample repetition
      Tested: bootstrap manifest generation for FMA and CCMusic; 2-epoch CPU training for models_v4; index_v4 build; fast eval JSON generation for smoke-v4
      Not-tested: whitelisted real audio ingestion beyond placeholder manifests; full melody-aware slow-eval on models_v4
      cnb.bofCdSsphPA authored
    • Make the benchmark pipeline produce reusable release artifacts from actual evaluation results so model iterations can be tracked, reviewed, and shipped with evidence.
      
      Constraint: Continuous training only helps if each stage emits durable reports and release metadata
      Rejected: Keep artifact generation as a disconnected smoke utility | Would block repeatable release discipline
      Confidence: high
      Scope-risk: moderate
      Directive: Next iterations should improve hard-case metrics on real/whitelisted datasets and keep artifact generation on every training milestone
      Tested: synthetic_v2 data regeneration; 2-epoch CPU training; index build; fast evaluation JSON export; artifact generation to reports/smoke-v2/synthetic_v2
      Not-tested: full melody-aware slow evaluation as release default; real external dataset benchmark generation
      cnb.bofCdSsphPA authored
    • Turn the docs set into a layered documentation portal with navigation, source tracing, and reusable governance templates so the project can scale beyond ad hoc notes.
      
      Constraint: Industrialization requires documentation that supports decisions, traceability, and release discipline
      Rejected: Keep docs as isolated topical files without navigation or templates | Would slow onboarding and weaken release governance
      Confidence: high
      Scope-risk: narrow
      Directive: Keep future docs in the executive-summary -> diagram -> table -> text -> appendix pattern with explicit Sources sections
      Tested: structural checks for core docs and templates; source-section checks; docs file-presence checks; service /config and /health smoke checks from earlier stage remain valid
      Not-tested: rendered markdown visuals in a browser; external publishing pipeline
      cnb.bofCdSsphPA authored
    • Prepare the prototype for industrial evolution by adding a service surface, external manifest conversion tools, and dataset adapter scaffolding with explicit licensing checkpoints.
      
      Constraint: Commercialization requires auditable data ingress and callable service boundaries, not just offline notebooks
      Rejected: Delay service and data-ingest work until after model perfection | Would block end-to-end productization and ops readiness
      Confidence: medium
      Scope-risk: moderate
      Directive: Next stages should connect real whitelisted datasets, benchmark latency, and improve hard-case acceptance/rejection quality
      Tested: dataset adapter registry/describe/init commands; manifest csv-to-catalog; service health; service build_index; service recognize; train.py --dry-run
      Not-tested: live uvicorn deployment; external dataset downloads; ANN-backed production indexing
      cnb.bofCdSsphPA authored
    • add src · 31a72045
      cnb.bofCdSsphPA authored
    • Shift the prototype toward music-retrieval behavior by documenting dataset contracts, upgrading the frontend to 128-bin Mel plus band splitting, and adding retrieval evaluation plus harder confusion-oriented augmentation.
      
      Constraint: The previous pipeline mixed train splits with the searchable catalog and hid real retrieval quality
      Rejected: Keep classification-centric validation and whole-song averaged references | It masked structural accuracy failures
      Confidence: medium
      Scope-risk: moderate
      Directive: Next iterations should target humming/confused top1 with specialized melody-aware retrieval and stronger real-data calibration
      Tested: synthetic_v2 generation; 3-epoch CPU training; index build; evaluate.py top1=0.65 top5=0.95 on test split
      Not-tested: external open-dataset ingestion; foundation-model baselines; production latency
      cnb.bofCdSsphPA authored