Models

Overview

Handy supports multiple speech-to-text models. All models run locally on your machine. Nothing is sent to the cloud.

Parakeet

NVIDIA’s Parakeet models are optimized for European languages. Parakeet V3 is the recommended model for most users: it is fast, accurate, and supports 25 languages. Parakeet V2 supports English only.

Parakeet writes out numbers as words (e.g. “one”, “two”, “three”) rather than digits. Language is detected automatically and cannot be manually specified.

Model	Size	Speed	Accuracy	Languages
Parakeet V3	~478 MB	Fast	High	25 European languages
Parakeet V2	~473 MB	Fast	High	English only

Parakeet V3 Supported Languages

English, Spanish, French, German, Portuguese, Russian, Italian, Polish, Dutch, Romanian, Ukrainian, Czech, Greek, Swedish, Hungarian, Bulgarian, Danish, Finnish, Slovak, Croatian, Lithuanian, Slovenian, Latvian, Estonian, Maltese

Whisper

OpenAI’s Whisper models support 99+ languages and, except for Turbo, optional translation to English. Whisper is the best choice if you need broad multilingual support.

Whisper outputs numbers as digits (e.g. “1”, “2”, “3”). You can either specify the language to transcribe or let Whisper detect it automatically. Whisper Medium is a good balance of speed and accuracy for users with capable hardware.

Model	Size	Speed	Accuracy	Translation
Small	~487 MB	Fast	Good	Yes
Medium	~492 MB	Moderate	Better	Yes
Large	~1.1 GB	Slower	Best	Yes
Turbo	~1.6 GB	Moderate	High	No

Whisper Supported Languages

English, Chinese (Simplified), Chinese (Traditional), Spanish, Hindi, Arabic, French, Portuguese, Bengali, Russian, Japanese, Indonesian, German, Korean, Turkish, Vietnamese, Italian, Thai, Urdu, Polish, Ukrainian, Dutch, Romanian, Persian, Malay, Tamil, Telugu, Tagalog, Swahili, Czech

Breeze ASR

A Whisper variant optimized for Taiwanese Mandarin with code-switching support (~1.1 GB).

Moonshine

Moonshine models are lightweight and English-only. Great for quick transcription on lower-powered hardware.

Model	Size	Speed	Accuracy
V2 Tiny	~31 MB	Fastest	Lower
V2 Small	~100 MB	Very fast	Good
V2 Medium	~192 MB	Fast	Better
Base	~58 MB	Very fast	Good

GigaAM v3

GigaAM v3 is a Russian speech recognition model. It supports punctuation, Latin characters, and digits. A good choice for Russian speakers who want fast, accurate local transcription.

Model	Size	Speed	Accuracy	Languages
GigaAM v3	~225 MB	Fast	High	Russian

Canary

NVIDIA’s Canary models support transcription and translation across European languages, and they output proper punctuation and capitalization. The 1B v2 model also applies inverse text normalization (ITN), which converts spoken-form numbers to digits (e.g. “one hundred twenty three” → “123”); the Flash model does not.
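To make the idea of inverse text normalization concrete, here is a toy sketch that turns simple English number phrases into integers. It is not how Canary implements ITN (the model produces normalized text directly); the function name words_to_number and the word tables are illustrative assumptions.

```python
# Toy inverse text normalization for simple English cardinal numbers.
# Illustrative only: this is NOT Canary's implementation.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000}


def words_to_number(phrase: str) -> int:
    """Convert a phrase such as 'one hundred twenty three' to an integer."""
    total = 0    # value of the groups already closed by a scale word
    current = 0  # value of the group being built (always below 1000)
    for word in phrase.lower().replace("-", " ").split():
        if word == "and":
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word in SCALES:
            total += current * SCALES[word]
            current = 0
        else:
            raise ValueError(f"not a number word: {word!r}")
    return total + current


print(words_to_number("one hundred twenty three"))  # 123
```

A sketch like this could also serve as a post-processing step for models that write numbers out as words, although real ITN has to handle dates, currency, ordinals, and other languages as well.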

Canary does not support automatic language detection. When language is set to “auto”, it defaults to English. This means if you speak another language with “auto” selected, Canary will translate your speech into English rather than transcribing it in the original language. You should manually select your language for best results.
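The “auto” behavior can be pictured as a simple fallback. The sketch below is a hypothetical illustration, not Handy’s actual code; the function name resolve_canary_language and the use of “en” as a language code are assumptions.

```python
# Hypothetical illustration of the "auto" fallback described above.
CANARY_FALLBACK_LANGUAGE = "en"  # assumed code for English


def resolve_canary_language(selected: str) -> str:
    """Return the language Canary would effectively use for a given setting."""
    normalized = selected.strip().lower()
    # Canary cannot detect the spoken language, so "auto" becomes English.
    if normalized == "auto":
        return CANARY_FALLBACK_LANGUAGE
    return normalized


print(resolve_canary_language("auto"))  # en
print(resolve_canary_language("de"))    # de
```

With “auto” selected, every recording is handled as if English had been chosen, which is why selecting your language explicitly matters for Canary.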

Model	Size	Speed	Accuracy	Translation	Languages
Canary 180M Flash	~146 MB	Fast	Good	Yes	4 languages
Canary 1B v2	~692 MB	Moderate	High	Yes	25 languages

Canary 180M Flash Supported Languages

English, German, Spanish, French

Canary 1B v2 Supported Languages

English, Spanish, French, German, Portuguese, Russian, Italian, Polish, Dutch, Romanian, Ukrainian, Czech, Greek, Swedish, Hungarian, Bulgarian, Danish, Finnish, Slovak, Croatian, Lithuanian, Slovenian, Latvian, Estonian, Maltese

SenseVoice

SenseVoice is a fast model from FunAudioLLM supporting a small set of East Asian languages plus English.

Model	Size	Speed	Accuracy	Languages
SenseVoice	~160 MB	Fastest	Good	5 languages

SenseVoice Supported Languages

Chinese, English, Japanese, Korean, Cantonese

Changing Your Model

Open Handy’s settings and select a model from the dropdown. New models will be downloaded on first use.

Speed vs. Accuracy

Fastest transcription: Moonshine (English only) or SenseVoice (Chinese, English, Japanese, Korean, Cantonese)
Best all-rounder: Parakeet V3 for European languages
Best for translation: Canary 1B v2 for European language translation
Best for Russian: GigaAM v3
Best multilingual: Whisper Small or Medium
Best accuracy: Whisper Large
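The recommendations above amount to a lookup from priority to model. A minimal sketch, assuming hypothetical priority keys such as "fastest" and "russian" that are not part of Handy itself:

```python
# The recommendations from the list above as a lookup table.
# The priority keys are made up for this example.
RECOMMENDATIONS = {
    "fastest": ["Moonshine", "SenseVoice"],
    "all_rounder": ["Parakeet V3"],
    "translation": ["Canary 1B v2"],
    "russian": ["GigaAM v3"],
    "multilingual": ["Whisper Small", "Whisper Medium"],
    "accuracy": ["Whisper Large"],
}


def recommend(priority: str) -> list[str]:
    """Return the suggested models for a priority, or raise ValueError."""
    try:
        # Return a copy so callers cannot modify the table by accident.
        return list(RECOMMENDATIONS[priority])
    except KeyError:
        known = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {known}") from None


print(recommend("russian"))  # ['GigaAM v3']
```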

Custom Models

As an experimental feature, you can load custom Whisper-compatible models. Place a .bin model file in the models/ directory inside your Handy app data folder (check About for the path). Handy will discover it on the next launch.
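Copying the file by hand works fine; if you prefer to script it, the step might look like the sketch below. The helper name install_custom_model is an assumption, and the app data path is deliberately left as a parameter because it differs per system (use the path shown in About).

```python
import shutil
from pathlib import Path


def install_custom_model(model_file: Path, app_data_dir: Path) -> Path:
    """Copy a .bin model into <app_data_dir>/models and return the new path.

    app_data_dir is the Handy app data folder shown in About; this sketch
    does not try to guess it.
    """
    if model_file.suffix.lower() != ".bin":
        raise ValueError(f"expected a .bin model file, got {model_file.name!r}")
    models_dir = app_data_dir / "models"
    models_dir.mkdir(parents=True, exist_ok=True)  # create models/ if missing
    destination = models_dir / model_file.name
    shutil.copy2(model_file, destination)
    return destination
```

After copying, restart Handy so it can discover the new model.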

Hardware Considerations

Whisper models use GPU acceleration automatically:

macOS: Metal
Windows/Linux: Vulkan

All other models (Parakeet, Canary, Moonshine, SenseVoice, GigaAM) currently run on CPU only. GPU acceleration support is coming soon.
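The backend rules above can be summarized as a small lookup. This is an illustrative sketch only: the function name acceleration_backend and the lowercase family names are assumptions, and Breeze ASR is left out because the text does not say which backend it uses.

```python
import sys

GPU_FAMILIES = {"whisper"}
CPU_FAMILIES = {"parakeet", "canary", "moonshine", "sensevoice", "gigaam"}


def acceleration_backend(model_family: str, platform: str = sys.platform) -> str:
    """Return 'Metal', 'Vulkan', or 'CPU' for a model family and platform."""
    family = model_family.strip().lower()
    if family in CPU_FAMILIES:
        return "CPU"
    if family not in GPU_FAMILIES:
        raise ValueError(f"unknown model family: {model_family!r}")
    # Whisper: Metal on macOS, Vulkan on Windows and Linux.
    if platform == "darwin":
        return "Metal"
    if platform == "win32" or platform.startswith("linux"):
        return "Vulkan"
    raise ValueError(f"platform not covered by this sketch: {platform!r}")
```

The platform strings follow Python’s sys.platform values ("darwin" for macOS, "win32" for Windows, "linux" for Linux).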
