Why this exists
Most speech tools stop at a transcript box. We think that leaves the job half done: voice should handle capture, editing, long-form document work, command execution, and review.
The product spans desktop binaries, browser surfaces, and an API because people need the same voice layer everywhere they already work.
Real-time speech
Fast dictation matters because voice only feels natural when the text keeps up with you. We optimize for low-latency transcription that stays usable in the middle of real work.
Command execution
Speech should not stop at transcription. Command mode turns spoken intent into edits, transforms, and app actions so you can keep moving.
Research and playback
A document should be something you can build, revise, and listen to. That is why deep research and document audio sit alongside dictation instead of being separate products.