DirectorKit Research

Experiments and research progression

A small public-facing index for prototypes, listening tests, milestone-style presentations, and demos that show how the speech stack is improving over time.

Current demos in apps/experiments

Each card is meant to be a short presentation: what we tested, why it matters, what to listen or look for, and where to open the demo.

New Video Page

Translation Functionality Demo

Placeholder presentation page for the already-recorded translation video, with space for outcome notes and next research questions.

video translation demo notes
Open video demo

Interactive Audio

CLSP Sphere Sorting Demo

Audits nearest-neighbor sample selection along CLSP prompt axes such as calm/nervous, shouting/whispering, and child/old person.

CLSP prompt axes static app
Open CLSP demo

Research Harness

Timestretch Algorithm Evaluation

Compares waveform stretch, Ruchey-lane stretch, and SDEdit-assisted stretch strategies for vowel-biased timing edits.

timestretch C/V probe warping
Read evaluation notes

Symlinked Demo

Ruchey VoiceTransform Demo

Local symlink to the VoiceTransformDiT CLSP SDEdit demo under apps/voice-daw/Ruchey_Tokenizer.

voice transform CLSP SDEdit GPU demo
Open symlinked demo

Recording Tool

Voice Dataset Recording Actor Control

Browser control surface for recording actors and dataset capture workflows used by the voice data pipeline.

dataset recording actor control
Open local app

Show audio quality as a sequence, not a single best clip

The most useful research presentation is a controlled listening ladder: same phrase, same speaker, same prompt or transformation, then one clip per model version or pipeline stage.

Use the audio comparison layout

01

Anchor

Original recording or target reference so listeners know the goal.

02

Baseline

Early Ruchey reconstruction or known failure mode, kept for contrast.

03

Current Best

Latest model pass with short notes on what improved and what remains.
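
The ladder above can be checked mechanically before a page is published. The sketch below is one possible way to do that in Python, not the project's actual tooling: the `Clip` record, the `check_ladder` helper, the file paths, and the example phrase are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """One rung of a listening ladder (hypothetical structure)."""
    label: str    # e.g. "Anchor", "Baseline", "Current Best"
    phrase: str   # text spoken in the clip
    speaker: str  # speaker identifier
    prompt: str   # prompt or transformation applied
    path: str     # illustrative file path, not a real asset


def check_ladder(clips):
    """Return a list of problems; an empty list means the ladder is controlled."""
    if not clips:
        return ["ladder has no clips"]
    problems = []
    reference = clips[0]
    for clip in clips[1:]:
        # Phrase, speaker, and prompt must match the first clip.
        for attr in ("phrase", "speaker", "prompt"):
            if getattr(clip, attr) != getattr(reference, attr):
                problems.append(
                    f"{clip.label}: {attr} differs from {reference.label}"
                )
    return problems


ladder = [
    Clip("Anchor", "The quick brown fox.", "spk01", "calm", "clips/anchor.wav"),
    Clip("Baseline", "The quick brown fox.", "spk01", "calm", "clips/baseline.wav"),
    Clip("Current Best", "The quick brown fox.", "spk01", "calm", "clips/current.wav"),
]

print(check_ladder(ladder))
```

An empty result means the model version or pipeline stage is the only thing that changes between clips.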

Suggested page types

Listening Test

Audio ladder, one insight per clip, short notes for artifacts and wins.

Video Walkthrough

Single video plus transcript bullets, setup details, and open questions.

Metric Snapshot

Compact table of objective metrics alongside a few representative clips.

Before / After

Two-column comparison for transformations like prompt edit or identity swap.
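
For the Metric Snapshot page type, a compact table can be generated as plain text. This is a minimal sketch under stated assumptions: `format_metric_table` is a hypothetical helper, `metric_a` and `metric_b` stand in for whichever objective metrics the project tracks, and the numbers are placeholders rather than measured results.

```python
def format_metric_table(rows, metrics):
    """Render (version, {metric: value}) rows as an aligned plain-text table."""
    header = ["version"] + list(metrics)
    body = [
        [name] + [f"{values[m]:.2f}" for m in metrics]
        for name, values in rows
    ]
    table = [header] + body
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    ]
    return "\n".join(lines)


# Placeholder values for illustration only, not real measurements.
rows = [
    ("baseline", {"metric_a": 1.0, "metric_b": 2.5}),
    ("current", {"metric_a": 1.5, "metric_b": 2.0}),
]

table_text = format_metric_table(rows, ["metric_a", "metric_b"])
print(table_text)
```

Keeping one row per model version mirrors the listening ladder, so each table row can sit next to its representative clip.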

Ways to make audio progress obvious

Use matched phrases

Keep the text, speaker, and prompt constant across clips so the model change is the only thing people hear.

Name the artifact

Add one short artifact label per clip: buzzy vocoder, duck voice, smeared consonants, pitch drift, identity bleed, or cleaner transients.

Show pipeline stages

For Ruchey, split encoder reconstruction, BridgeDiT refinement, vocoder choice, and prompt-conditioned transform into separate rows.

Keep a failure gallery

Curated failures are persuasive because they make the later improvement concrete and help explain why a new architecture mattered.
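
The artifact labels and pipeline stages named above can be treated as fixed vocabularies so that every clip row is labeled consistently. The following is an illustrative sketch, not the project's code: `STAGES`, `ARTIFACTS`, `build_rows`, and the file paths are hypothetical, with the stage names and labels copied from the text above.

```python
# Pipeline stages in presentation order, as listed in the text above.
STAGES = [
    "encoder reconstruction",
    "BridgeDiT refinement",
    "vocoder choice",
    "prompt-conditioned transform",
]

# Short artifact labels, one per clip.
ARTIFACTS = {
    "buzzy vocoder",
    "duck voice",
    "smeared consonants",
    "pitch drift",
    "identity bleed",
    "cleaner transients",
}


def build_rows(clips):
    """Validate (stage, artifact, path) tuples and return them in stage order."""
    for stage, artifact, _path in clips:
        if stage not in STAGES:
            raise ValueError(f"unknown pipeline stage: {stage}")
        if artifact not in ARTIFACTS:
            raise ValueError(f"unknown artifact label: {artifact}")
    # sorted() is stable, so clips within one stage keep their input order.
    return sorted(clips, key=lambda clip: STAGES.index(clip[0]))


clips = [
    ("vocoder choice", "buzzy vocoder", "clips/vocoder.wav"),
    ("encoder reconstruction", "smeared consonants", "clips/encoder.wav"),
    ("BridgeDiT refinement", "cleaner transients", "clips/bridge.wav"),
]

for stage, artifact, path in build_rows(clips):
    print(f"{stage}: {artifact} ({path})")
```

Rejecting unknown labels keeps the vocabulary small, which makes artifacts easier to compare across pages and in a failure gallery.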

Deploy the landing page

This folder includes a minimal Node static server, Dockerfile, Cloud Build config, and Makefile defaults for the DirectorKit dashboard project.

cd apps/experiments
make deploy
make map-domain