Ruchey Audio Quality Progression
Compare source clips, old checkpoints, BridgeDiT/RNDVoC/Flow2GAN passes, and newer model generations in one repeatable listening-test layout.
A small public-facing index for prototypes, listening tests, milestone-style presentations, and demos that show how the speech stack is improving over time.
apps/experimentsEach card is meant to be a short presentation: what we tested, why it matters, what to listen or look for, and where to open the demo.
Compare source clips, old checkpoints, BridgeDiT/RNDVoC/Flow2GAN passes, and newer model generations in one repeatable listening-test layout.
Placeholder presentation page for the translation video you already recorded, with space for outcome notes and next research questions.
Audits nearest-neighbor sample selection along CLSP prompt axes such as calm/nervous, shouting/whispering, and child/old person.
Compares waveform stretch, Ruchey-lane stretch, and SDEdit-assisted stretch strategies for vowel-biased timing edits.
Local symlink to the VoiceTransformDiT CLSP SDEdit demo under
apps/voice-daw/Ruchey_Tokenizer.
Browser control surface for recording actors and dataset capture workflows used by the voice data pipeline.
The most useful research presentation is a controlled listening ladder: same phrase, same speaker, same prompt or transformation, then one clip per model version or pipeline stage.
Use the audio comparison layoutOriginal recording or target reference so listeners know the goal.
Early Ruchey reconstruction or known failure mode, kept for contrast.
Latest model pass with short notes on what improved and what remains.
Audio ladder, one insight per clip, short notes for artifacts and wins.
Single video plus transcript bullets, setup details, and open questions.
Compact table of objective metrics alongside a few representative clips.
Two-column comparison for transformations like prompt edit or identity swap.
Keep the text, speaker, and prompt constant across clips so the model change is the only thing people hear.
Add one short artifact label per clip: buzzy vocoder, duck voice, smeared consonants, pitch drift, identity bleed, or cleaner transients.
For Ruchey, split encoder reconstruction, BridgeDiT refinement, vocoder choice, and prompt-conditioned transform into separate rows.
Curated failures are persuasive because they make the later improvement concrete and help explain why a new architecture mattered.
This folder includes a minimal Node static server, Dockerfile, Cloud Build config, and Makefile defaults for the DirectorKit dashboard project.
cd apps/experiments
make deploy
make map-domain