$ anuragh@portfolio

$ cat ~/work/vani.case

~/work/vani.case · WIP

Vani.

Desktop app · In progress

// A cross-platform desktop dictation app: press a hotkey, speak naturally, and get text that is transcribed by Whisper, polished by Claude, and injected into whichever app is active.

role = "Solo builder — product, Electron shell, and AI pipeline"
timeline = "2026 – Present"
stack = "Electron / React / TypeScript / Vite"

impact.md

// Impact at a glance

  • Global hotkey dictation that injects polished text into any active app without focus loss
  • Whisper STT + Claude cleanup pipeline: transcription and editing in under 3 seconds
  • Machine-specific encrypted key storage, content protection, and no backend relay
Electron · React · TypeScript · Vite · Tailwind CSS · Whisper · OpenAI · Claude

summary.md

// summary

Vani is an Electron desktop app for macOS and Windows. Press a global hotkey from any app, speak naturally, and get polished text injected directly into the focused window — powered by OpenAI Whisper for transcription and Claude for cleanup, with local model support planned.

problem.md

// problem

Dictation tools either require constant app-switching or are locked to one input field. There is no frictionless way to speak naturally and get edited text wherever your cursor already is.

// what I built

A floating pill overlay appears on hotkey press, records mic audio with a live waveform, transcribes via Whisper, cleans up with Claude, and injects the result directly into the focused window — all without leaving your current app.
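The loop described above (hotkey, record, transcribe, clean, inject) can be modelled as a small state machine. This is a minimal sketch, not Vani's actual code: the phase names, the event names, and the rule that any error resets to idle are all assumptions made for illustration.

```typescript
// Hypothetical phase and event names; the source only gives the order of the steps.
type Phase = "idle" | "recording" | "transcribing" | "cleaning" | "injecting";
type DictationEvent =
  | "hotkey"
  | "silence"
  | "transcribed"
  | "cleaned"
  | "injected"
  | "error";

// Allowed transitions. Recording can end on a second hotkey press or on silence.
const transitions: Record<Phase, Partial<Record<DictationEvent, Phase>>> = {
  idle: { hotkey: "recording" },
  recording: { hotkey: "transcribing", silence: "transcribing" },
  transcribing: { transcribed: "cleaning" },
  cleaning: { cleaned: "injecting" },
  injecting: { injected: "idle" },
};

function nextPhase(phase: Phase, event: DictationEvent): Phase {
  // Assumed policy: any failure returns the overlay to idle.
  if (event === "error") return "idle";
  // Events that do not apply to the current phase are ignored.
  return transitions[phase][event] ?? phase;
}
```

Keeping the transitions in one table means a stray event, such as a late "cleaned" arriving after a reset, leaves the phase unchanged instead of pushing the overlay into an inconsistent state.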

// core experience

  • Press Cmd+Shift+Space from any app: a floating overlay appears and starts recording immediately
  • Live waveform feedback with silence auto-stop; transcription and cleanup in under 3 seconds
  • Full dashboard for history, notes, model settings, and usage, accessible from the system tray
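One common way to implement silence auto-stop is to track the RMS level of each audio frame and stop once it has stayed below a threshold for long enough. The sketch below assumes that approach. The source does not say how Vani detects silence, and the threshold and hold time are placeholder values.

```typescript
// Root-mean-square level of one audio frame (samples in the range -1..1).
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < frame.length; i++) sum += frame[i] * frame[i];
  return Math.sqrt(sum / frame.length);
}

// Accumulates consecutive quiet time; defaults are assumed, not measured.
class SilenceDetector {
  private quietMs = 0;
  private readonly threshold: number;
  private readonly holdMs: number;

  constructor(threshold: number = 0.01, holdMs: number = 1500) {
    this.threshold = threshold;
    this.holdMs = holdMs;
  }

  // Feed one frame; returns true once silence has lasted at least holdMs.
  push(frame: Float32Array, frameMs: number): boolean {
    this.quietMs = rms(frame) < this.threshold ? this.quietMs + frameMs : 0;
    return this.quietMs >= this.holdMs;
  }
}
```

Any frame above the threshold resets the counter, so a short pause between sentences does not end the recording.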

architecture.md

// architecture

  • Electron 30 main process with an IPC surface for transcription, cleanup, text injection, notes, and history
  • React 18 + Vite renderer for both the dashboard and the floating overlay pill
  • OpenAI Whisper for STT, Claude for cleanup; electron-store with machine-specific AES encryption
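An IPC surface like the one listed above is often guarded by an allowlist, so the renderer can only reach named channels. The sketch below is framework-free and illustrative: the five channel names are hypothetical (one per area mentioned), and in Electron each registered handler would typically be wired up through `ipcMain.handle`.

```typescript
// Hypothetical channel names, one per area listed in the architecture notes.
const CHANNELS = ["transcribe", "cleanup", "inject", "notes", "history"] as const;
type Channel = (typeof CHANNELS)[number];
type Handler = (payload: unknown) => unknown;

function isChannel(name: string): name is Channel {
  return (CHANNELS as readonly string[]).indexOf(name) !== -1;
}

class IpcSurface {
  private handlers = new Map<Channel, Handler>();

  register(channel: Channel, handler: Handler): void {
    if (this.handlers.has(channel)) {
      throw new Error(`duplicate handler: ${channel}`);
    }
    this.handlers.set(channel, handler);
  }

  // Reject anything outside the allowlist before it reaches a handler.
  dispatch(channel: string, payload: unknown): unknown {
    if (!isChannel(channel)) throw new Error(`unknown channel: ${channel}`);
    const handler = this.handlers.get(channel);
    if (!handler) throw new Error(`no handler for: ${channel}`);
    return handler(payload);
  }
}
```

Typing `register` against the `Channel` union catches misspelled channel names at compile time, while `dispatch` still validates strings that arrive at runtime.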

// ai involvement

Whisper handles speech-to-text; Claude cleans up the raw transcript into polished, context-aware prose. Local model runtime via Faster-Whisper and llama.cpp is the next milestone.
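To make the cleanup step concrete, here is a rough sketch of how a raw transcript might be packaged into a request body for Claude. The field names (`model`, `max_tokens`, `system`, `messages`) follow the Anthropic Messages API. The prompt wording and the `buildCleanupRequest` helper are placeholders and not what Vani ships, and the model id is left to the caller.

```typescript
interface CleanupRequest {
  model: string;
  max_tokens: number;
  system: string;
  messages: { role: "user"; content: string }[];
}

// Placeholder prompt; the source does not publish the real one.
const SYSTEM_PROMPT =
  "Rewrite the dictated text as clean prose. Remove filler words and fix " +
  "punctuation, but keep the speaker's meaning. Return only the rewritten text.";

function buildCleanupRequest(
  transcript: string,
  model: string,
  maxTokens: number = 1024,
): CleanupRequest {
  const text = transcript.trim();
  // Skip the API call entirely when Whisper returned nothing usable.
  if (text.length === 0) throw new Error("empty transcript");
  return {
    model,
    max_tokens: maxTokens,
    system: SYSTEM_PROMPT,
    messages: [{ role: "user", content: text }],
  };
}
```

Building the body in a pure function keeps the prompt testable without a network call; the main process would serialize the result and send it with the stored API key.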

challenges.md

// challenges

  • Injecting text reliably into any active app across macOS and Windows without stealing focus
  • Keeping the overlay lightweight and hidden from screen recorders with content protection enabled
  • Machine-specific encryption for API key storage without a backend or cloud dependency
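As an illustration of the third challenge, the sketch below derives an AES-256 key from a machine identifier and uses it to seal an API key with Node's built-in `crypto` module. It is standalone and does not reflect electron-store's internals. The choice of scrypt and GCM, the `sealSecret` and `openSecret` names, and the `machineId` argument are all assumptions, since the source does not say which identifier or cipher mode Vani uses.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Derive a 32-byte key from a machine identifier supplied by the caller.
function deriveKey(machineId: string, salt: Buffer): Buffer {
  return scryptSync(machineId, salt, 32);
}

// Encrypt with AES-256-GCM. Output layout: salt | iv | tag | ciphertext, base64.
function sealSecret(secret: string, machineId: string): string {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKey(machineId, salt), iv);
  const body = Buffer.concat([cipher.update(secret, "utf8"), cipher.final()]);
  return Buffer.concat([salt, iv, cipher.getAuthTag(), body]).toString("base64");
}

// Decrypt; throws if the machine identifier differs or the data was altered.
function openSecret(sealed: string, machineId: string): string {
  const raw = Buffer.from(sealed, "base64");
  const salt = raw.subarray(0, 16);
  const iv = raw.subarray(16, 28);
  const tag = raw.subarray(28, 44);
  const body = raw.subarray(44);
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(machineId, salt), iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}
```

Because the key is derived from the machine rather than stored, a config file copied to another computer fails authentication on decrypt, and no backend is involved.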

// outcome

The core dictation loop works end to end: hotkey, record, transcribe, clean, inject. Local model runtime and notarization are the remaining milestones before public release.

why.md

// why this matters

It shows I can build ambient, privacy-conscious desktop AI tools that disappear into the user's workflow rather than demanding attention.

reflection.md

// reflection

The hardest part is injection — every app handles focus and input events differently. Reliability here matters more than features.

capabilities.md

// capabilities

Desktop apps · Real-time AI · Voice interfaces · Privacy-first design

[NORMAL]·~/anuragh-ragidimilli·main·5 projects·uptime: 100%