SottoASR

Local, privacy-first speech-to-text for macOS

Press a hotkey, speak, and text appears at your cursor.
All processing happens on-device — no audio or text ever leaves your Mac.

Windows coming soon

Open Source · MIT Licensed · 100% Local

Everything you need, nothing you don't

Completely Private

All audio capture and transcription happens on-device. No cloud APIs, no telemetry, no data collection. Your words stay on your Mac.

Blazing Fast

~44x real-time transcription on Apple Silicon: a 60-second recording takes roughly 1.4 seconds to transcribe. Powered by CoreML and the Apple Neural Engine for hardware-accelerated inference.

Push-to-Talk

Hold Cmd+Shift+Space to talk and release to transcribe, or press Cmd+Shift+D to toggle hands-free recording. All shortcuts are fully customizable.

AI Transcript Cleanup

An on-device LLM automatically removes filler words, fixes punctuation, and polishes your transcriptions, all locally with Qwen3.5-0.8B.

25 Languages

Multilingual speech recognition powered by NVIDIA Parakeet TDT v3, a 600M-parameter model supporting 25 languages out of the box.

Menu Bar App

Lives quietly in your menu bar. No Dock icon, no distractions. A floating overlay appears only when you need it.

Transcription History

Browse, search, and copy past transcriptions. Every recording is saved locally so you never lose what you said.

Paste at Cursor

Transcribed text is automatically pasted wherever your cursor is. Works in any app — editors, browsers, terminals, chat windows.

Lightweight

Just a 14 MB app bundle. Models are downloaded once on first launch (~500 MB) and cached locally. Minimal resource usage when idle.

Get started in seconds

v0.2.4 Latest Release

Requirements

  • macOS 14 Sonoma or later
  • Apple Silicon (M1, M2, M3, M4)
  • ~500 MB disk space for models
  • Accessibility permission (for paste-at-cursor)

Download for macOS

Download the .dmg from the GitHub release page

Windows support is coming soon.

Built in the open, for everyone

SottoASR is fully open source under the MIT License. Inspect the code, contribute features, report bugs, or fork it and make it your own. Privacy you can verify — not just trust.

Built on the shoulders of giants

SottoASR relies on outstanding open-source libraries and models. All 660+ dependencies use permissive or weak-copyleft licenses (MIT, Apache-2.0, BSD, MPL-2.0, Unicode-3.0, ISC, Zlib, CC-BY-4.0). See THIRD_PARTY_LICENSES for the full list.

Speech Recognition

  • NVIDIA Parakeet TDT v3 · ASR model, 600M params, 25 languages · CC-BY-4.0
  • FluidAudio · CoreML/ANE inference engine (Swift) · Apache-2.0
  • parakeet-rs · ONNX Runtime Rust bindings · MIT
  • cpal · Cross-platform audio capture · Apache-2.0
  • hound · WAV encoding/decoding · Apache-2.0
  • rubato · Audio resampling · MIT

AI Transcript Cleanup

  • Qwen3.5-0.8B · LLM by Alibaba Cloud (Qwen team) · Apache-2.0
  • Apple MLX · Metal-native ML framework · MIT
  • mlx-lm · MLX language model inference · MIT
  • huggingface_hub · Model download and caching · Apache-2.0

Application Framework

  • Tauri v2 · Native app shell (Rust + Web) · MIT
  • Svelte 5 · Reactive UI framework · MIT
  • tauri-nspanel · macOS NSPanel overlay windows · MIT
  • Vite · Frontend build tool · MIT
  • Tokio · Async Rust runtime · MIT
  • serde · Serialization framework · MIT

Author

  • Juan Villa

Contributors

  • Ian Scofield

Built with significant assistance from Claude Code by Anthropic.

Want to contribute? Check out the GitHub repository.