Bridge internal CSV exports into manifest bundles before ingestion at scale
Constraint: Internal asset exports should reach train/test-ready manifests without repeated manual reshaping
Rejected: Stop at references/queries JSON only | Still leaves each import needing custom bundle assembly and split logic
Confidence: high
Scope-risk: narrow
Directive: Keep internal manifest emission conservative and deterministic; preserve train/test query presence even on tiny exports
Tested: internal_asset_type_mapper.py sample run with --emit-manifests produced catalog/train/test/val and balanced 1 query in both train and test
Not-tested: Duration/offset enrichment from live source metadata and audio-path existence checks on production exports
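The directive above (deterministic emission, with queries present in both train and test even on tiny exports) could be sketched roughly as follows. This is an illustration only, not the code in internal_asset_type_mapper.py: the function name `split_queries`, the `test_fraction` parameter, the hash-based ordering, and the choice to place a lone query in both splits are all assumptions.

```python
import hashlib


def split_queries(query_ids, test_fraction=0.2):
    """Hypothetical helper: deterministically split query IDs into train/test.

    Assumption: ordering by a SHA-256 digest of each ID stands in for whatever
    deterministic rule the real mapper uses.
    """
    # Deduplicate and order by digest so input order never changes the result.
    ordered = sorted(
        set(query_ids),
        key=lambda q: hashlib.sha256(q.encode("utf-8")).hexdigest(),
    )
    if not ordered:
        return {"train": [], "test": []}
    if len(ordered) == 1:
        # Assumption: on a one-query export, the query appears in both splits
        # so that neither manifest is left without a query.
        return {"train": list(ordered), "test": list(ordered)}
    # Keep at least one query on each side of the split.
    n_test = max(1, min(len(ordered) - 1, round(len(ordered) * test_fraction)))
    return {"train": sorted(ordered[n_test:]), "test": sorted(ordered[:n_test])}
```

Because the split depends only on the set of IDs, re-running it on the same export yields the same manifests regardless of row order in the CSV.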
Showing 3 changed files with 118 additions and 8 deletions