99ersstudio

Music Production Pipeline

Nine-stage AI music + video + publish pipeline for YouTube. Two genre families, 892 tests, schema-validated artifacts, CLI-first.

892 tests · 2 genre families · 9 pipeline stages

The problem

Independent AI-music YouTube channels live and die on consistent output: render a longform mix, slice five Shorts, generate a Spotify Canvas, write SEO-tight titles + tags + hashtags, schedule the upload, post the cross-channel teasers, then do it again next Friday. Doing that manually is a part-time job. The existing creator tools cover one slice each — a renderer, a metadata writer, a scheduler — and assume you will glue them together with Zapier. We wanted one CLI / GUI that owns the whole job and refuses to publish anything that fails an originality check.

How we built it

  1. Modeled the full job as nine pure stages (`ingest → analyze → route → guard → generate → rank → render → upload_prep → publish`). Every stage takes JSON in, emits JSON out, and is validated against one of 12+ schemas in `contracts/schemas.py` (`additionalProperties: false`). Every stage supports `--dry-run`. Jobs are resumable via `RunState` tracking.
  2. Made `genre_family` a first-class parameter so routing, SEO, rendering, and packaging never hardcode a genre. Two families ship today — melodic deep house and lofi hip hop — each with five subroutes (`night_drive`, `sunset`, `emotional`, `atmospheric`, `clubby` for the first; `study_chill`, `jazzhop`, `rainy_night`, `tape_nostalgia`, `boombox` for the second). A third family is a config + prompt drop-in.
  3. Built an originality guard that blocks artist references, song titles, and performer cloning before generation, and a publishing approval gate that requires `--approve` to bypass. FFmpeg drawtext is sanitised against injection (`\ ' : ; % \n` escaped). Job IDs and artist IDs are validated against path traversal.
  4. Added BPM-synced video polish: librosa-derived RMS pulse curves drive zoom + brightness, five genre presets per family with their own colour grading and xfade transition pools, segment merging at 8 s+ minimum so nothing feels choppy, optional Demucs stem separation with per-stem effects (drums → zoom, bass → vignette, vocals → brightness), and chorus-aware Shorts selection (chroma self-similarity → energy → heuristic fallback).
  5. Wrapped the long tail: DALL-E 3 + Pillow thumbnails (3 variants), Anthropic / OpenAI LLM SEO with rule-based fallback, DeepL / Google auto-translation in 6 languages, ElevenLabs voice intros / outros, Dolby.io mastering with FFmpeg `loudnorm` (EBU R128) fallback, Runway Gen-4 image-to-video with FFmpeg zoom fallback, Telegram bot for remote audio submission, album release scheduler with Friday alignment, smart-link landing pages, Mailchimp campaigns, EPK generator, content calendar via Buffer.
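The stage contract in step 1 can be sketched in a few lines. This is a simplified stand-in, not the real `contracts/schemas.py`: the allow-list check mimics `additionalProperties: false`, and the `RunState`, `run`, and handler names are illustrative.

```python
# Toy sketch of the nine-stage contract: every stage is a pure function
# JSON-dict -> JSON-dict, every output is checked against an allow-list
# (standing in for additionalProperties: false), and RunState records
# completed stages so a job can resume where it stopped.
STAGES = ["ingest", "analyze", "route", "guard", "generate",
          "rank", "render", "upload_prep", "publish"]

def check_schema(payload: dict, allowed: set) -> None:
    # Reject unknown fields, like additionalProperties: false would.
    extra = set(payload) - allowed
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")

class RunState:
    def __init__(self) -> None:
        self.completed = []  # stages already run, for resumability

def run(job: dict, handlers: dict, state: RunState, dry_run: bool = False) -> dict:
    # Skip stages the RunState already marks as done.
    for stage in STAGES[len(state.completed):]:
        if dry_run:
            print(f"[dry-run] would run {stage}")
            continue
        job = handlers[stage](job)
        check_schema(job, allowed={"job_id", "genre_family", "stage"})
        state.completed.append(stage)
    return job
```

Keeping stages pure and serialisable is what makes `--dry-run` and resumption cheap: a crashed job replays from the last validated artifact instead of from scratch.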
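The "genre as data, not code" idea in step 2 amounts to a routing table. A minimal sketch, assuming the family and subroute names from above (the table shape and `route` function are illustrative, not the pipeline's actual API):

```python
# Subroutes are plain data keyed by genre_family, so a third family is
# one more entry plus its prompts -- no code changes in routing.
SUBROUTES = {
    "melodic_deep_house": ["night_drive", "sunset", "emotional",
                           "atmospheric", "clubby"],
    "lofi_hip_hop": ["study_chill", "jazzhop", "rainy_night",
                     "tape_nostalgia", "boombox"],
}

def route(genre_family: str, subroute: str) -> str:
    # Fail fast on combinations the config does not declare.
    if subroute not in SUBROUTES.get(genre_family, []):
        raise ValueError(f"unknown route {genre_family}/{subroute}")
    return f"{genre_family}:{subroute}"
```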
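The sanitisation in step 3 is small enough to show. A hedged sketch escaping exactly the characters the list above names (`\ ' : ; %` and newlines) plus an ID allow-list against path traversal; function names are illustrative:

```python
import re

def sanitize_drawtext(text: str) -> str:
    # Backslash first, so later escapes are not double-escaped.
    out = text.replace("\\", "\\\\")
    for ch in "':;%":
        out = out.replace(ch, "\\" + ch)
    return out.replace("\n", "\\n")

_JOB_ID = re.compile(r"[A-Za-z0-9_-]+")

def validate_job_id(job_id: str) -> str:
    # No slashes or dots means no "../" traversal in derived paths.
    if not _JOB_ID.fullmatch(job_id):
        raise ValueError(f"unsafe job id: {job_id!r}")
    return job_id
```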
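The RMS-pulse-to-zoom mapping in step 4 is, at its core, a normalisation. In the real pipeline the curve comes from `librosa.feature.rms`; here a plain list stands in, and `base` and `depth` are made-up illustrative values:

```python
def rms_to_zoom(rms, base=1.0, depth=0.06):
    # Normalise the loudness curve to [0, 1], then map louder frames
    # to deeper zoom so the video pulses with the track.
    lo, hi = min(rms), max(rms)
    if hi == lo:
        return [base] * len(rms)
    return [base + depth * (v - lo) / (hi - lo) for v in rms]
```

The same shaping drives brightness, and with Demucs stems each curve can come from a single stem (drums for zoom, bass for vignette) instead of the full mix.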
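The Friday alignment in the release scheduler (step 5) is a one-liner worth showing. A minimal sketch, with the function name assumed:

```python
import datetime as dt

def align_to_friday(day: dt.date) -> dt.date:
    # weekday(): Monday=0 ... Friday=4. Roll forward 0-6 days so a
    # Friday stays put and any other day snaps to the next Friday.
    return day + dt.timedelta(days=(4 - day.weekday()) % 7)
```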

Outcome

892+ tests passing (unit + integration + 30 dedicated hardening tests covering FFmpeg injection, zero-duration guards, boundary clamping, fallback chains). Every external API has a graceful degradation path: no DALL-E falls back to Pillow gradients, no Dolby to `loudnorm`, no Runway to FFmpeg zoompan, no LLM SEO to rule-based metadata, and a missing music-gen provider keeps the rest of the pipeline running. The pipeline can do the entire job offline-first against a local finished track, then opt into each cloud service when its env var is present.
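The env-var-gated fallback pattern can be sketched in miniature. `master_track`, the env var name, and the stub return values are illustrative stand-ins for the real adapters:

```python
import os

def master_track(path: str) -> str:
    # Cloud mastering only when the key is present; otherwise the
    # always-available local FFmpeg loudnorm (EBU R128) path.
    if os.environ.get("DOLBY_API_KEY"):
        return f"dolby:{path}"      # Dolby.io adapter (stubbed here)
    return f"loudnorm:{path}"       # FFmpeg fallback, no network needed
```

Repeating this shape for every provider is what makes the offline-first claim hold: the local branch is the default, and each cloud branch is strictly opt-in.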

Stack


Python 3.14 (src layout) · Click CLI (`yt-music`) · Streamlit GUI on port 8501 · FFmpeg for all rendering · librosa for analysis · Demucs for stem separation · Pillow for thumbnail fallback · Chromaprint / AcoustID for fingerprinting · YAML configs for genres / policies / channels / artists / destinations · 12+ JSON schemas validating every artifact · 30+ env vars (every one optional with a fallback).

Next up

Wire the real music-generation provider (Suno or Udio adapter is stubbed but not connected) and finish the YouTube Data API upload adapter so the publish stage stops being a dry-run by default. Then expand beyond two genre families.
