Remote Voice Agents

As I’ve spent more time in business ideation mode, I’ve written a lot of docs to kick the tires on ideas.

Typing works, but voice chat with ChatGPT while driving has been illuminating. Having an LLM to bounce ideas off of, run searches, and so on is pretty great. Side note: Claude’s voice mode is nowhere near as fluid as OpenAI’s current implementation.

Both share the downside of not being able to save generated docs anywhere. Tool calls with interruptible voice agents seem a bit tricky, as I found out.

Beyond a single file edit, it’s quite helpful to be able to ask Claude Code to evaluate multiple documents, or to create new files that derive from or support existing ones.

I’ve also wanted the option to kick off Claude Code to do things while driving, either to build something or to explore an existing codebase for answers.


The goal

A web app running on my home dev machine, reachable from outside, that lets me voice-control a file editor plus a Claude agent runner. Essentially a voice product development assistant.


Results

A working web app that uses OpenAI’s realtime API for voice control and the Claude Agent SDK to run jobs and summarize them back to me by voice. This runs locally on my MacBook at home, using Tailscale to securely expose the web app to my phone for voice chat on the road.
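
To make the "run jobs and summarize them back" part concrete, here is a minimal sketch of one way to keep a voice session responsive while an agent job runs. It is not the app's actual code. The idea is that the voice agent's tool call only starts a job and returns an id right away, and a second tool call checks on it later. Everything named here (JobRegistry, start_job, job_status, fake_runner) is hypothetical, and fake_runner stands in for whatever really calls the Claude Agent SDK.

```python
import asyncio
import itertools
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, Optional

# Assumption: the agent is wrapped as "prompt in, final text out".
AgentRunner = Callable[[str], Awaitable[str]]


@dataclass
class Job:
    job_id: str
    prompt: str
    status: str = "running"  # "running", "done", or "failed"
    result: Optional[str] = None


class JobRegistry:
    """Tracks background agent jobs so voice tool calls can return immediately."""

    def __init__(self, runner: AgentRunner) -> None:
        self._runner = runner
        self._jobs: Dict[str, Job] = {}
        self._tasks: Dict[str, asyncio.Task] = {}
        self._ids = itertools.count(1)

    def start_job(self, prompt: str) -> str:
        """Tool handler: schedule the job and hand back an id without waiting."""
        job = Job(job_id=f"job-{next(self._ids)}", prompt=prompt)
        self._jobs[job.job_id] = job
        self._tasks[job.job_id] = asyncio.create_task(self._run(job))
        return job.job_id

    async def _run(self, job: Job) -> None:
        try:
            job.result = await self._runner(job.prompt)
            job.status = "done"
        except Exception as exc:  # report the failure instead of losing it
            job.result = f"{type(exc).__name__}: {exc}"
            job.status = "failed"

    def job_status(self, job_id: str) -> dict:
        """Tool handler: short status the voice agent can read back."""
        job = self._jobs.get(job_id)
        if job is None:
            return {"status": "unknown"}
        return {"status": job.status, "summary": (job.result or "")[:200]}

    async def wait(self, job_id: str) -> None:
        """Wait for a job to finish (handy for tests and demos)."""
        await self._tasks[job_id]


async def fake_runner(prompt: str) -> str:
    # Stand-in for a real agent call; it just pretends to work briefly.
    await asyncio.sleep(0.01)
    return f"Finished: {prompt}"


async def demo() -> dict:
    registry = JobRegistry(fake_runner)
    job_id = registry.start_job("list TODOs in the repo")
    first = registry.job_status(job_id)  # checked before the job has run
    await registry.wait(job_id)
    return {"first": first, "last": registry.job_status(job_id)}
```

Running asyncio.run(demo()) shows the job reported as running on the first check and done with a short summary afterwards. Because start_job never blocks, an interruption in the voice conversation does not cancel the underlying work.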

Learnings

UX

Tech


Current status: In personal use, iterating

Atlas is here, bringing browser use (and probably realtime voice) to doc editing in all existing spec formats. Seems like it will eat a lot of this white space.

Claude Code cloud and Codex cloud are here, getting us away from our local machines.

Seems like the pieces are coming together for a fully cloud-based framework for voice specs plus code agent jobs. That cloud-based, voice-agent-enabled unified spec system seems like white space to fill in. Maybe collaborative specification?