Add a single-page open dataset workflow for training prep
Constraint: Open-dataset onboarding needed one short executable path instead of scattered instructions across many docs
Rejected: Leave ingestion knowledge split across multiple pages only | Raises setup friction before real FMA or MTG-Jamendo training
Confidence: high
Scope-risk: narrow
Directive: Use the single-page workflow as the default operator path before adding more open-dataset sources
Tested:
  /usr/local/miniconda3/bin/python src/data/external_adapters.py inspect-local fma data/synthetic_v2/songs --eval-ratio 0.2 --query-duration 5.0
  /usr/local/miniconda3/bin/python src/data/external_adapters.py prepare-local fma data/synthetic_v2/songs --output-root data/external_ingested/synthetic_as_open --eval-ratio 0.2 --query-duration 5.0
  /usr/local/miniconda3/bin/python src/data/external_adapters.py validate-local fma data/external_ingested/synthetic_as_open/fma/manifests
Not-tested: Real FMA or MTG-Jamendo local download directories
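The three tested commands form an inspect, prepare, validate sequence. A minimal sketch of a wrapper that builds them in that order follows. The helper names `build_commands` and `run_workflow` are hypothetical and not part of the repository. The manifest location `<output_root>/<dataset>/manifests` is an assumption inferred from the single path in the Tested trailer.

```python
import subprocess
from pathlib import Path

ADAPTER = "src/data/external_adapters.py"  # script path taken from the Tested trailer


def build_commands(python, dataset, source_dir, output_root,
                   eval_ratio=0.2, query_duration=5.0):
    """Return three argv lists in the order inspect, prepare, validate."""
    split_flags = ["--eval-ratio", str(eval_ratio),
                   "--query-duration", str(query_duration)]
    # Assumption: manifests land in <output_root>/<dataset>/manifests,
    # inferred from the one example path in the commit message.
    manifests = Path(output_root) / dataset / "manifests"
    base = [python, ADAPTER]
    return [
        base + ["inspect-local", dataset, source_dir] + split_flags,
        base + ["prepare-local", dataset, source_dir,
                "--output-root", output_root] + split_flags,
        base + ["validate-local", dataset, manifests.as_posix()],
    ]


def run_workflow(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failing step


if __name__ == "__main__":
    cmds = build_commands("/usr/local/miniconda3/bin/python", "fma",
                          "data/synthetic_v2/songs",
                          "data/external_ingested/synthetic_as_open")
    run_workflow(cmds, dry_run=True)  # set dry_run=False to actually execute
```

The dry-run default only prints the commands, which allows a check against the Tested trailer before anything touches the data directories.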
Showing 8 changed files with 842 additions and 0 deletions
docs/open-dataset-workflow.md
0 → 100644