- 02 Jun, 2026 11 commits
Broaden external dataset bootstrap support and replace naive hard-case oversampling with a more targeted weighting signal that measurably helps humming-like queries while preserving the release/eval workflow.

Constraint: Hard-case optimization must be evidence-driven and preserve a record of mixed outcomes across iterations
Rejected: Reuse naive oversampling after regression | it already showed worse overall behavior with no hard-case gain
Confidence: medium
Scope-risk: moderate
Directive: Next iteration should target confused-case negatives explicitly; do not assume humming gains transfer to confusion robustness
Tested: bootstrap generation for MTG-Jamendo and ModelScope placeholders; 2-epoch CPU training for models_v5; index_v5 build; fast eval JSON generation for smoke-v5
Not-tested: real audio ingestion for the new datasets; full melody-aware slow evaluation on models_v5
cnb.bofCdSsphPA authored -
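The commit above replaces naive oversampling with a targeted weighting signal, but the log does not show the mechanism. The following is only a minimal sketch of how per-sample loss weighting can stand in for sample repetition. The tag name `humming`, the weight `2.0`, and both function names are hypothetical and are not taken from the repository.

```python
from typing import Dict, List, Sequence

# Hypothetical per-tag weights; the project's real tags and values are not shown in the log.
TAG_WEIGHTS: Dict[str, float] = {"humming": 2.0}


def sample_weights(tags: Sequence[str],
                   tag_weights: Dict[str, float] = TAG_WEIGHTS) -> List[float]:
    """Return one loss weight per sample; tags without an entry keep weight 1.0."""
    return [tag_weights.get(tag, 1.0) for tag in tags]


def weighted_mean_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-sample losses, normalized by the total weight."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return sum(loss * w for loss, w in zip(losses, weights)) / total


if __name__ == "__main__":
    tags = ["clean", "humming", "clean"]
    losses = [0.2, 0.8, 0.2]
    print(weighted_mean_loss(losses, sample_weights(tags)))
```

Unlike duplicating hard samples, this approach leaves the number of samples per epoch unchanged and only shifts how much each one contributes to the loss.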
Extend the data ingress path with bootstrap manifests for real datasets and capture an unsuccessful hard-case oversampling experiment so future iterations can avoid repeating the same weak strategy.

Constraint: Continuous optimization requires preserving negative results, not just successful ones
Rejected: Drop the oversampling attempt without record | would lose evidence and encourage redoing the same low-yield change
Confidence: high
Scope-risk: moderate
Directive: Next hard-case work should focus on melody-aware supervision and harder negatives instead of naive sample repetition
Tested: bootstrap manifest generation for FMA and CCMusic; 2-epoch CPU training for models_v4; index_v4 build; fast eval JSON generation for smoke-v4
Not-tested: whitelisted real audio ingestion beyond placeholder manifests; full melody-aware slow-eval on models_v4
cnb.bofCdSsphPA authored -
Make the benchmark pipeline produce reusable release artifacts from actual evaluation results so model iterations can be tracked, reviewed, and shipped with evidence.

Constraint: Continuous training only helps if each stage emits durable reports and release metadata
Rejected: Keep artifact generation as a disconnected smoke utility | would block repeatable release discipline
Confidence: high
Scope-risk: moderate
Directive: Next iterations should improve hard-case metrics on real/whitelisted datasets and keep artifact generation on every training milestone
Tested: synthetic_v2 data regeneration; 2-epoch CPU training; index build; fast evaluation JSON export; artifact generation to reports/smoke-v2/synthetic_v2
Not-tested: full melody-aware slow evaluation as release default; real external dataset benchmark generation
cnb.bofCdSsphPA authored -
Turn the docs set into a layered documentation portal with navigation, source tracing, and reusable governance templates so the project can scale beyond ad hoc notes.

Constraint: Industrialization requires documentation that supports decisions, traceability, and release discipline
Rejected: Keep docs as isolated topical files without navigation or templates | would slow onboarding and weaken release governance
Confidence: high
Scope-risk: narrow
Directive: Keep future docs in the executive-summary -> diagram -> table -> text -> appendix pattern with explicit Sources sections
Tested: structural checks for core docs and templates; source-section checks; docs file-presence checks; service /config and /health smoke checks from earlier stage remain valid
Not-tested: rendered markdown visuals in a browser; external publishing pipeline
cnb.bofCdSsphPA authored -
Prepare the prototype for industrial evolution by adding a service surface, external manifest conversion tools, and dataset adapter scaffolding with explicit licensing checkpoints.

Constraint: Commercialization requires auditable data ingress and callable service boundaries, not just offline notebooks
Rejected: Delay service and data-ingest work until after model perfection | would block end-to-end productization and ops readiness
Confidence: medium
Scope-risk: moderate
Directive: Next stages should connect real whitelisted datasets, benchmark latency, and improve hard-case acceptance/rejection quality
Tested: dataset adapter registry/describe/init commands; manifest csv-to-catalog; service health; service build_index; service recognize; train.py --dry-run
Not-tested: live uvicorn deployment; external dataset downloads; ANN-backed production indexing
cnb.bofCdSsphPA authored -
cnb.bofCdSsphPA authored
-
Shift the prototype toward music-retrieval behavior by documenting dataset contracts, upgrading the frontend to 128-bin Mel plus band splitting, and adding retrieval evaluation plus harder confusion-oriented augmentation.

Constraint: The previous pipeline mixed train splits with the searchable catalog and hid real retrieval quality
Rejected: Keep classification-centric validation and whole-song averaged references | it masked structural accuracy failures
Confidence: medium
Scope-risk: moderate
Directive: Next iterations should target humming/confused top1 with specialized melody-aware retrieval and stronger real-data calibration
Tested: synthetic_v2 generation; 3-epoch CPU training; index build; evaluate.py top1=0.65 top5=0.95 on test split
Not-tested: external open-dataset ingestion; foundation-model baselines; production latency
cnb.bofCdSsphPA authored -
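The `top1=0.65 top5=0.95` figures reported above are top-k retrieval accuracies. As a rough illustration of how such numbers are commonly computed (this is not the repository's `evaluate.py`, whose internals are not shown in the log), here is a sketch that assumes each query yields a ranked list of catalog IDs. The function name `top_k_accuracy` and the ID strings are hypothetical.

```python
from typing import Dict, Iterable, Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]],
                   truth: Sequence[str],
                   ks: Iterable[int] = (1, 5)) -> Dict[int, float]:
    """Fraction of queries whose true catalog ID appears in the first k results.

    ranked[i] is the ranked result list for query i (best match first);
    truth[i] is the correct catalog ID for query i.
    """
    if len(ranked) != len(truth):
        raise ValueError("ranked and truth must have the same length")
    if not truth:
        raise ValueError("at least one query is required")
    result: Dict[int, float] = {}
    for k in ks:
        # A query counts as a hit at k if the true ID is among its top-k results.
        hits = sum(1 for results, target in zip(ranked, truth) if target in results[:k])
        result[k] = hits / len(truth)
    return result


if __name__ == "__main__":
    ranked = [["a", "b"], ["b", "a"], ["c", "a"]]
    truth = ["a", "a", "b"]
    print(top_k_accuracy(ranked, truth, ks=(1, 2)))
```

For this kind of metric to reflect real retrieval quality, the queries should come from a split that is kept separate from the searchable catalog, which is the constraint the commit message calls out.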
cnb.bofCdSsphPA authored
-
Add missing project documentation and a minimal executable demo flow so the repository can be understood and validated end to end.

Constraint: The existing repo had design fragments but no verified runnable path
Rejected: Delay documentation until after full productization | would keep scope opaque and slow iteration
Confidence: medium
Scope-risk: moderate
Directive: Keep future stages checkpointed with changelog entries and runnable verification commands
Tested: synthetic dataset generation; train.py --dry-run; 1 epoch CPU training; index build; recognition JSON output
Not-tested: production-scale retrieval; real copyrighted audio; API serving
cnb.bofCdSsphPA authored -
cnb.bofCdSsphPA authored
-
cnb.bofCdSsphPA authored
-