VoxScribe

Privacy-first local AI voice dictation with real-time speech-to-text, intelligent text editing, and global text injection.

Why This Project

I built VoxScribe because voice input should be private, fast, and work everywhere, without sending your words to the cloud. The goal is to let users dictate into any app, with real-time transcription and AI-powered text cleanup running locally.

Standout Features

  • Real-time voice capture with Web Audio API and live waveform visualization
  • Browser-native speech-to-text via SpeechRecognition API with interim results
  • Three edit modes: Raw (no changes), Light Edit (grammar + filler removal), Aggressive Rewrite (full tone transformation)
  • Tone selector: Casual, Professional, or Concise output styles
  • Canvas-based waveform visualizer with bar and wave rendering modes
  • Dictation history with copy-to-clipboard support
  • Keyboard shortcut support (Space to toggle recording)
  • Dark-mode-first glassmorphism UI designed to feel like a native desktop app
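To make the Light Edit mode concrete, here is a minimal sketch of filler removal with basic sentence cleanup. The function name `lightEdit` and the `FILLERS` list are hypothetical, chosen for illustration; the actual rules VoxScribe applies may differ.

```typescript
// Hypothetical filler list; the project's real list and rules may differ.
const FILLERS: string[] = ["um", "uh", "erm", "you know"];

// Sketch of a "Light Edit" pass: drop filler words, collapse whitespace,
// capitalize the first letter, and ensure terminal punctuation.
export function lightEdit(raw: string): string {
  // Match each filler as a whole word, plus an optional trailing comma and spaces.
  const pattern = new RegExp(`\\b(?:${FILLERS.join("|")})\\b,?\\s*`, "gi");
  const cleaned = raw.replace(pattern, "").replace(/\s+/g, " ").trim();
  if (cleaned.length === 0) return cleaned;

  const capitalized = cleaned.charAt(0).toUpperCase() + cleaned.slice(1);
  // Leave existing end punctuation alone; otherwise add a period.
  return /[.!?]$/.test(capitalized) ? capitalized : capitalized + ".";
}
```

Under these assumptions, lightEdit("um I think we should uh ship it today") returns "I think we should ship it today." Whole-word matching keeps words such as "umbrella" intact.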

Tech Stack

  • Next.js 16 + React 19 + TypeScript
  • Tailwind CSS v4 + shadcn/ui
  • Web Audio API (AnalyserNode for real-time frequency data)
  • SpeechRecognition API (browser-native STT)
  • Canvas API (waveform rendering)
  • Tauri-ready architecture (IPC structure for Rust backend)
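As a rough illustration of how AnalyserNode frequency data can drive a bar-style waveform, the sketch below averages byte frequency bins into a fixed number of bars and scales them to a pixel height. The function name `barHeights` and its parameters are assumptions for this example, not names taken from the project.

```typescript
// Sketch: convert byte frequency data (values 0-255, as filled in by
// AnalyserNode.getByteFrequencyData) into bar heights for a canvas.
// `barCount` and `maxHeight` are hypothetical parameters.
export function barHeights(
  freq: Uint8Array,
  barCount: number,
  maxHeight: number
): number[] {
  const heights: number[] = [];
  // Each bar averages a contiguous group of frequency bins.
  const binsPerBar = Math.max(1, Math.floor(freq.length / barCount));

  for (let bar = 0; bar < barCount; bar++) {
    const start = bar * binsPerBar;
    const end = Math.min(start + binsPerBar, freq.length);
    let sum = 0;
    for (let i = start; i < end; i++) sum += freq[i] ?? 0;
    const avg = end > start ? sum / (end - start) : 0;
    // Normalize from the 0-255 byte range to 0-maxHeight pixels.
    heights.push((avg / 255) * maxHeight);
  }
  return heights;
}
```

In the browser, the input would typically be a Uint8Array of length analyser.frequencyBinCount, refilled with analyser.getByteFrequencyData(data) on each animation frame before the bars are drawn with the Canvas API.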

Quick Run

pnpm install
pnpm dev

No environment variables are required; everything runs locally in the browser.