VoxScribe
Privacy-first local AI voice dictation with real-time speech-to-text, intelligent text editing, and global text injection.
Why This Project
I built VoxScribe because voice input should be private and fast, and should work everywhere without sending your words to the cloud. The goal is to let users dictate into any app, with real-time transcription and AI-powered text cleanup running entirely locally.
Standout Features
- Real-time voice capture with Web Audio API and live waveform visualization
- Browser-native speech-to-text via SpeechRecognition API with interim results
- Three edit modes: Raw (no changes), Light Edit (grammar + filler removal), Aggressive Rewrite (full tone transformation)
- Tone selector: Casual, Professional, or Concise output styles
- Canvas-based waveform visualizer with bar and wave rendering modes
- Dictation history with copy-to-clipboard support
- Keyboard shortcut support (Space to toggle recording)
- Dark-mode-first glassmorphism UI designed to feel like a native desktop app
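To make the edit modes concrete, here is a minimal sketch of how Raw and Light Edit could differ. It is not VoxScribe's actual implementation: `EditMode`, `FILLERS`, `lightEdit`, and `applyEditMode` are hypothetical names, the filler list is illustrative, and Aggressive Rewrite is left out because it would need a language model rather than simple rules.

```typescript
// Hypothetical sketch; names and filler list are assumptions, not project code.
type EditMode = "raw" | "light";

// Illustrative filler words; a real list would likely be longer and locale-aware.
const FILLERS = ["um", "uh", "erm", "you know"];

// Light Edit: drop filler words, tidy spacing, capitalize, and end with punctuation.
function lightEdit(text: string): string {
  // Match whole-word fillers, plus a trailing comma if one follows directly.
  const pattern = new RegExp(`\\b(?:${FILLERS.join("|")})\\b,?`, "gi");
  const cleaned = text
    .replace(pattern, "")
    .replace(/\s+/g, " ") // collapse runs of whitespace left by removals
    .replace(/\s+([,.!?])/g, "$1") // remove spaces before punctuation
    .trim();
  if (cleaned === "") return cleaned;
  const capitalized = cleaned.charAt(0).toUpperCase() + cleaned.slice(1);
  return /[.!?]$/.test(capitalized) ? capitalized : capitalized + ".";
}

// Raw returns the transcript untouched; Light applies the cleanup above.
function applyEditMode(text: string, mode: EditMode): string {
  return mode === "raw" ? text : lightEdit(text);
}
```

For example, `applyEditMode("um so I think, uh, we should ship it", "light")` yields `"So I think, we should ship it."`, while the same call with `"raw"` returns the input unchanged.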
Tech Stack
- Next.js 16 + React 19 + TypeScript
- Tailwind CSS v4 + shadcn/ui
- Web Audio API (AnalyserNode for real-time frequency data)
- SpeechRecognition API (browser-native STT)
- Canvas API (waveform rendering)
- Tauri-ready architecture (IPC structure for Rust backend)
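As an illustration of the AnalyserNode-to-Canvas path, the sketch below turns one frame of byte frequency data into bar heights for a bar-mode visualizer. It is an assumption about how such a step might look, not the project's code: `FrequencySource`, `toBarHeights`, and `readBars` are hypothetical names. `FrequencySource` mirrors the two `AnalyserNode` members used here (`frequencyBinCount` and `getByteFrequencyData`) so the sketch compiles without DOM typings.

```typescript
// Hypothetical sketch; names are assumptions, not project code.
// Structural stand-in for the parts of AnalyserNode this example needs.
interface FrequencySource {
  frequencyBinCount: number;
  getByteFrequencyData(target: Uint8Array): void;
}

// Average groups of frequency bins (values 0..255) into barCount bar heights
// scaled to the range 0..maxHeight.
function toBarHeights(bins: Uint8Array, barCount: number, maxHeight: number): number[] {
  const heights: number[] = [];
  const binsPerBar = Math.max(1, Math.floor(bins.length / barCount));
  for (let bar = 0; bar < barCount; bar++) {
    const start = bar * binsPerBar;
    const end = Math.min(start + binsPerBar, bins.length);
    let sum = 0;
    for (let i = start; i < end; i++) sum += bins[i] ?? 0;
    // Bars past the end of the data get height 0.
    const average = end > start ? sum / (end - start) : 0;
    heights.push((average / 255) * maxHeight);
  }
  return heights;
}

// Read one frame from the analyser and convert it to bar heights.
function readBars(source: FrequencySource, barCount: number, maxHeight: number): number[] {
  const bins = new Uint8Array(source.frequencyBinCount);
  source.getByteFrequencyData(bins);
  return toBarHeights(bins, barCount, maxHeight);
}
```

In the browser, `readBars` would typically be called once per animation frame with a real `AnalyserNode`, and each returned height drawn as a rectangle on the canvas.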
Quick Run
pnpm install
pnpm dev

No environment variables required. Everything runs locally in the browser.