OmniVoice is an Apache-2.0 voice model offering text-to-speech, zero-shot voice cloning from a few seconds of audio, and voice design from a text description. It runs locally on NVIDIA GPUs, Apple Silicon, or CPU, with an optional cloud tier. A single model covers 646 languages and generates speech at roughly 45x real-time, that is, about 45 seconds of audio per second of compute.
Best for: Developers who want open, locally run voice cloning and TTS with broad language coverage.