Case study

4 hours to 45 minutes

An AI-governed editorial platform with 7 specialised agents, a 7-step pipeline with quality gates, and institutional-grade output — for under $3 per run.

The challenge

Content production doesn't scale

Manual production bleeds time

Hours of raw audio become weeks of editorial work. Research, transcription, writing, fact-checking, translation — each step requires a specialist, and each specialist costs. A single magazine issue ties up an entire team. Automating content operations isn't optional when your production cadence outpaces your headcount.

AI output fails editorial standards

Off-the-shelf AI writes generic content that fails institutional requirements. No source verification, no editorial voice compliance, no audit trail. Under EU AI Act transparency rules, AI-generated content must be traceable — who wrote what, from which sources, with what confidence level.

Scaling means losing governance

One AI pipeline is manageable. Seven agents across two departments, each with different models, budgets, and quality requirements? That demands a structured governance layer — not more prompts.

What was built

An AI newsroom

1. Audio processing pipeline

Raw audio is ingested via smart-split at silence boundaries, normalized to broadcast standards with 2-pass loudnorm, then transcribed via Whisper with speaker identification. SHA-256 caching means reprocessing identical audio costs nothing.
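A minimal TypeScript sketch of the content-hash caching idea, assuming a hypothetical transcribeAudio worker and a simple on-disk cache; the platform's actual API is not shown here.

```typescript
import { createHash } from "node:crypto";
import { mkdir, readFile, writeFile } from "node:fs/promises";
import { existsSync } from "node:fs";

// Hypothetical stand-in for the Whisper call with speaker identification.
async function transcribeAudio(audio: Buffer): Promise<string> {
  return "…transcript…";
}

async function transcribeWithCache(path: string): Promise<string> {
  const audio = await readFile(path);
  // Key the cache on content, not filename: identical audio is free to reprocess.
  const key = createHash("sha256").update(audio).digest("hex");
  const cachePath = `.cache/${key}.json`;
  if (existsSync(cachePath)) {
    return JSON.parse(await readFile(cachePath, "utf8")).transcript;
  }
  const transcript = await transcribeAudio(audio);
  await mkdir(".cache", { recursive: true });
  await writeFile(cachePath, JSON.stringify({ transcript }));
  return transcript;
}
```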

Output: Clean transcriptions

2. Agent architecture

7 specialised agents across 2 departments: Data Operations and Editorial. Each agent has a defined role, a model assignment (Sonnet for volume, Opus for judgment), a budget ceiling, and governance rules, all coordinated through a central control plane.
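As a sketch of what such a roster might look like in config; the names, budgets, and model ids below are illustrative, not the platform's actual schema.

```typescript
// Hypothetical roster shape; fields and values are invented for illustration.
type Department = "data-operations" | "editorial";

interface AgentSpec {
  name: string;
  department: Department;
  model: "sonnet" | "opus"; // Sonnet for volume, Opus for judgment
  budgetUsd: number;        // hard per-run ceiling enforced by the control plane
}

const roster: AgentSpec[] = [
  { name: "transcription-refiner", department: "data-operations", model: "sonnet", budgetUsd: 0.2 },
  { name: "researcher",            department: "data-operations", model: "sonnet", budgetUsd: 0.4 },
  { name: "writer",                department: "editorial",       model: "opus",   budgetUsd: 0.8 },
  { name: "reviewer",              department: "editorial",       model: "opus",   budgetUsd: 0.5 },
  // …the real roster has seven agents across the two departments.
];
```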

Output: Governed agent roster

3. 7-step editorial pipeline

Research, Write, Caption, Review, Voice Consistency, Voice Review, Translate. Every stage has quality gates: RAG-based fact-checking, editorial DNA compliance scoring, and confidence thresholds that tighten with each iteration.
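A sketch of the tightening-gate idea in TypeScript; the stage names come from the pipeline above, while the hooks and the specific thresholds are assumptions.

```typescript
const stages = [
  "research", "write", "caption", "review",
  "voice-consistency", "voice-review", "translate",
] as const;

interface StageHooks {
  score(draft: string): Promise<number>;  // confidence 0–1 from the quality gate
  revise(draft: string): Promise<string>; // one revision pass by the stage's agent
}

// The bar rises on every pass (0.75, 0.80, 0.85 …, capped at 0.95), so a
// revision loop must either converge or stop.
function thresholdFor(iteration: number, base = 0.75, step = 0.05, cap = 0.95): number {
  return Math.min(base + step * iteration, cap);
}

async function runStage(
  stage: (typeof stages)[number],
  draft: string,
  hooks: StageHooks,
): Promise<string> {
  for (let i = 0; i < 4; i++) {
    if ((await hooks.score(draft)) >= thresholdFor(i)) return draft;
    draft = await hooks.revise(draft);
  }
  throw new Error(`${stage}: gate not cleared within budget, escalating to human review`);
}
```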

Output: Publication-ready content

4. Non-technical interfaces

A dashboard for entity management and pipeline monitoring, plus a content studio designed for users who have never touched an AI tool — brief-assisted workflows, parallel generation across 5 formats, and a collaborative editor with review workflows.
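Illustratively, parallel generation can be as simple as fanning one brief out with Promise.all; the format names and the generateFormat call below are assumptions, not the studio's actual API.

```typescript
const formats = ["article", "newsletter", "social-post", "summary", "captions"] as const;
type Format = (typeof formats)[number];

// Hypothetical entry point into the editorial pipeline for one format.
async function generateFormat(brief: string, format: Format): Promise<string> {
  return `[${format}] draft for: ${brief}`;
}

// Fan one brief out to all five formats at once.
async function generateAll(brief: string): Promise<Record<Format, string>> {
  const drafts = await Promise.all(
    formats.map(async (f) => [f, await generateFormat(brief, f)] as const),
  );
  return Object.fromEntries(drafts) as Record<Format, string>;
}
```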

Output: Self-service tools

Measurable outcomes

Production-grade, not prototype-grade

7
Specialised AI agents

Organised into 2 departments — Data Operations and Editorial — each with distinct model routing. Lightweight models handle transcription refinement; frontier models handle editorial judgment. No single-model bottleneck.

Editorial DNA compliance

A 12-principle calibration system verifies every generated section against institutional voice specifications, measured against published reference issues rather than estimated. Each principle is weighted by priority, with confidence decay preventing infinite revision loops. Data provenance is built into every claim.
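One way to read that mechanically, with invented principle names and weights (the real 12 principles and their priorities aren't published), and decay interpreted as a discount on self-reported confidence:

```typescript
// Illustrative principles; the platform defines 12 with its own weights.
interface Principle { name: string; weight: number }

const principles: Principle[] = [
  { name: "tone", weight: 3 },
  { name: "terminology", weight: 2 },
  { name: "structure", weight: 1 },
  // …nine more in the real system.
];

// Weighted average of per-principle scores (each 0–1).
function complianceScore(scores: Map<string, number>): number {
  const total = principles.reduce((s, p) => s + p.weight, 0);
  const sum = principles.reduce((s, p) => s + p.weight * (scores.get(p.name) ?? 0), 0);
  return sum / total;
}

// Confidence decay: each revision discounts confidence, so a section that
// keeps rewriting drifts below the gate and escalates instead of looping.
function decayedConfidence(raw: number, revision: number, decay = 0.9): number {
  return raw * Math.pow(decay, revision);
}
```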

<$3
Per complete magazine

Smart model routing keeps costs predictable: per-section budget caps at 15%, a 70% warning threshold, and a 2x circuit breaker that halts runaway spending. Crash recovery resumes from the last checkpoint — no wasted tokens.
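The controls above map naturally onto a small guard object. The thresholds are the case study's; the interface, and the reading of the 15% cap as a share of the run budget, are assumptions.

```typescript
// Sketch of the spend controls: 15% per-section cap, 70% warning, 2x circuit breaker.
class BudgetGuard {
  private spentUsd = 0;

  constructor(private readonly runBudgetUsd: number) {}

  recordSection(costUsd: number): void {
    // Per-section cap, read here as 15% of the run budget (an assumption).
    if (costUsd > this.runBudgetUsd * 0.15) {
      throw new Error(`section cost $${costUsd} exceeds the 15% cap`);
    }
    this.spentUsd += costUsd;
    // Circuit breaker: halt outright at 2x the planned budget.
    if (this.spentUsd > this.runBudgetUsd * 2) {
      throw new Error("circuit breaker tripped: spend exceeded 2x budget");
    }
    // Soft warning once 70% of the budget is gone.
    if (this.spentUsd > this.runBudgetUsd * 0.7) {
      console.warn(`budget warning: ${((this.spentUsd / this.runBudgetUsd) * 100).toFixed(0)}% spent`);
    }
  }
}
```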

Tech stack

Built for production

Agent orchestration
Paperclip · MCP SDK · Claude Sonnet · Claude Opus · GPT-5.1 · Mastra.ai
Editorial pipeline
TypeScript · Fastify · SQLite · Drizzle ORM · Zod · Next.js 16
Audio & voice
Whisper · Qwen3-TTS · Silero VAD · Demucs · ffmpeg · Redis Streams
Quality & observability
Vitest · 1500+ tests · RAG fact-checking · Upstash Vector · Langfuse
Frequently asked questions

Common questions

How do AI agents maintain editorial voice across all content?

Every agent operates under a 12-principle editorial DNA system with weighted priorities. Generated sections are scored against calibration targets derived from published reference issues. Confidence thresholds tighten with each revision — if an agent can't meet the standard within budget, it escalates to human review rather than degrading quality. This is how production AI automation differs from prompt engineering.

What happens when an agent produces factually incorrect content?

Every factual claim is verified through RAG-based fact-checking against source transcriptions, event programs, and speaker bios. The fact-checker assigns confidence scores (0–1) to each claim. Below threshold, the section gets flagged for rewrite with the specific failing claims highlighted. The data research system ensures provenance is built into the data model — every fact traces to its source.
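As a data shape, a verdict might look like the following; the field names are illustrative and the RAG retrieval itself is elided.

```typescript
// Illustrative verdict shape; the platform's actual schema is not published.
interface ClaimVerdict {
  claim: string;
  sources: string[];   // transcript, program, or bio passages that support it
  confidence: number;  // 0–1, assigned by the fact-checker
}

// Claims below the gate come back with their evidence attached, so the
// rewrite prompt can name exactly which claims failed and why.
function flagForRewrite(verdicts: ClaimVerdict[], threshold = 0.8): ClaimVerdict[] {
  return verdicts.filter((v) => v.confidence < threshold);
}
```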

Can this approach work for content that isn't audio-based?

The agent architecture is source-agnostic — audio processing is just the first ingestion module. The editorial pipeline, quality gates, and governance layer work identically with documents, web research, structured data, or any combination. Content operations built this way adapt to new input types. The same approach powers custom business tools across completely different domains.

Need a similar pipeline?

Let's talk about automating your editorial or research workflow.