Document asset-type training policy before bulk internal ingestion
Constraint: Internal media types need a clear training whitelist and versioning policy before they are mapped into manifests and pgvector
Rejected: Treat all audio-like assets as the same training label source | Would blur original-vs-instrumental semantics and degrade retrieval quality
Confidence: high
Scope-risk: narrow
Directive: Keep original recordings, instrumental variants, and short-video clips explicitly separated by audio_role and version semantics during ingestion
Tested: Verified new documentation anchors and mapping tables in training-data-and-pgvector-guide.md
Not-tested: Automated import from the upstream SQL type enum into manifests
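
The directive above could be enforced at ingestion time roughly as follows. This is a minimal sketch, not the mapping documented in training-data-and-pgvector-guide.md: the asset type strings, the `AudioRole` enum, `ManifestEntry`, and `to_manifest_entry` are all hypothetical names chosen for illustration, since the upstream SQL type enum is not shown in this commit.

```python
from dataclasses import dataclass
from enum import Enum


class AudioRole(Enum):
    """Hypothetical audio_role values, one per asset class named in the directive."""
    ORIGINAL = "original"
    INSTRUMENTAL = "instrumental"
    SHORT_VIDEO_CLIP = "short_video_clip"


# Assumed whitelist: the keys are placeholder type names, not the real upstream enum.
ASSET_TYPE_TO_ROLE = {
    "original_recording": AudioRole.ORIGINAL,
    "instrumental_variant": AudioRole.INSTRUMENTAL,
    "short_video_clip": AudioRole.SHORT_VIDEO_CLIP,
}


@dataclass(frozen=True)
class ManifestEntry:
    asset_id: str
    audio_role: AudioRole
    version: int


def to_manifest_entry(asset_id: str, asset_type: str, version: int) -> ManifestEntry:
    """Map an asset to a manifest entry, rejecting types that are not whitelisted."""
    role = ASSET_TYPE_TO_ROLE.get(asset_type)
    if role is None:
        # No fallback role: an unknown type is refused rather than merged into another class.
        raise ValueError(f"asset type {asset_type!r} is not on the training whitelist")
    return ManifestEntry(asset_id=asset_id, audio_role=role, version=version)
```

Raising on unknown types, instead of defaulting to a generic audio label, is one way to avoid the rejected option of treating all audio-like assets as the same label source.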
Showing 2 changed files with 161 additions and 0 deletions