faster-whisper (CTranslate2)
4× faster Whisper inference — streaming-capable, CPU/GPU, production-ready
Architecture
Recommended for the Phase 1 MVP sovereign ASR pipeline: 4× faster than base Whisper, streaming-capable with VAD (pair with silero-vad for real-time endpointing), and fully sovereign. Benchmark: ~150 ms latency vs. Deepgram's ~75 ms, an acceptable trade-off for a Swiss sovereign deployment.
Analysis
faster-whisper is the production-ready variant of Whisper: it uses CTranslate2 to deliver 4× faster inference at the same accuracy, and int8 quantization makes CPU-only deployment viable. Streaming mode with silero-vad achieves ~150 ms latency. It is used in LiveKit voice agent templates and Audiogami-style deployments.
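The int8/CPU and VAD points above map directly onto faster-whisper's API. A minimal sketch (model size, beam size, and VAD thresholds are illustrative choices, not recommendations from this evaluation):

```python
# Sketch: batch transcription with faster-whisper's built-in silero-vad filter.
# Model size, compute type, and VAD thresholds below are illustrative.

def transcribe_with_vad(audio_path: str, model_size: str = "base"):
    """Transcribe a file, letting the silero-vad filter skip silence."""
    from faster_whisper import WhisperModel  # pip install faster-whisper

    # int8 keeps memory low enough for CPU-only deployment.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")

    segments, info = model.transcribe(
        audio_path,
        beam_size=5,
        vad_filter=True,  # silero-vad pre-filter
        vad_parameters={"min_silence_duration_ms": 500},
    )
    # segments is a generator; realize it to get timed text chunks.
    return [(s.start, s.end, s.text) for s in segments], info.language
```

For true streaming (rather than file-at-a-time transcription), an external VAD loop feeds audio chunks to the model as utterances close, as noted in the strengths/weaknesses below.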
Strengths
- 4× faster than original Whisper
- Same 2.7% WER accuracy
- CPU viable with int8
- Streaming mode with VAD
- MIT license — full sovereignty
- LiveKit integration available
Weaknesses
- 150ms latency (2× Deepgram)
- No speaker diarization
- Requires VAD setup for streaming
- GPU still recommended for production
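Since VAD setup is the main streaming prerequisite, the endpointing logic silero-vad provides can be illustrated with a toy version that uses a per-frame energy gate as a stand-in (function name, frame sizes, and thresholds here are hypothetical):

```python
# Toy streaming endpointer: a per-frame energy gate stands in for silero-vad.
# Thresholds are illustrative; e.g. with 20 ms frames, silence_frames=25
# corresponds to roughly a 500 ms trailing-silence endpoint.

def detect_endpoints(energies, threshold=0.01, silence_frames=25):
    """Return (start, end) frame-index pairs for detected speech segments."""
    segments, start, silent = [], None, 0
    for i, e in enumerate(energies):
        if e >= threshold:
            if start is None:          # speech onset
                start = i
            silent = 0
        elif start is not None:
            silent += 1
            if silent >= silence_frames:  # enough trailing silence: endpoint
                segments.append((start, i - silent + 1))
                start, silent = None, 0
    if start is not None:              # stream ended mid-utterance
        segments.append((start, len(energies)))
    return segments
```

A real deployment would feed silero-vad speech probabilities instead of raw energies and hand each closed segment to faster-whisper for transcription.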
Pricing
Free (self-hosted). GPU compute: ~$0.05–0.20/hour with int8.
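To put the GPU figure in per-audio terms: assuming, for illustration, a real-time factor of about 10× on GPU (an assumption, not a benchmark from this page), the compute cost per transcribed audio-hour is small:

```python
# Back-of-envelope cost per transcribed audio-hour.
gpu_cost_per_hour = 0.20   # upper end of the ~$0.05-0.20/hour range above
real_time_factor = 10.0    # assumed: 1 GPU-hour transcribes ~10 audio-hours

cost_per_audio_hour = gpu_cost_per_hour / real_time_factor
print(f"~${cost_per_audio_hour:.2f} per hour of audio")  # ~$0.02
```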
Sovereignty & Compliance
Full on-premise. MIT license.
Data residency: Full control.
Full self-hosted. 4× faster than original Whisper. CPU viable with int8 quantization.
faster-whisper (CTranslate2) — Strategic Positioning
Beyond the technical specs: where does this tool sit in the ecosystem, and what are the risks and strategic implications for GamiWays?
Faster-Whisper is the production multiplier for open-source STT: 4× faster inference, 50% less memory, MIT license — the engineering bridge between Whisper's accuracy and real-time deployment requirements.
A. Strategic Positioning
Target customer: Developer / Enterprise — optimized inference, resource-constrained deployments
4× faster Whisper inference with 50% less memory via CTranslate2 — the production-ready Whisper for self-hosted deployments.
B. Competitive Moat
- 4× faster inference than standard Whisper with 50% less memory — production-grade optimization
- CTranslate2 optimization — runs efficiently on CPU and GPU
- MIT license — zero licensing cost, full commercial use
Vulnerability: dependent on upstream Whisper model quality. Community-maintained, with no commercial support. Newer optimized models may supersede it.
E. Strategic Questions for GamiWays
Sovereignty fit
Fully self-hostable on Swiss/EU infrastructure. MIT license. 4× speed improvement makes self-hosted Whisper viable for real-time applications.
Build vs. Buy
Build (integrate) for Phase 2 sovereignty. The 4× speed improvement makes it the default choice for self-hosted Whisper deployments.
Lock-in risk
MIT open-source — zero vendor lock-in. Dependency on Whisper model architecture is the only constraint.
Roadmap alignment
Excellent for Phase 2 sovereignty. Makes self-hosted Whisper competitive with cloud STT in terms of latency.
Data Freshness
SYSTRAN faster-whisper benchmarks 2025
Update note: faster-whisper v1.1.0 (Jan 2025). 4× faster than original Whisper on GPU, 2× on CPU with int8. Streaming mode with silero-vad confirmed. LiveKit voice agent template uses faster-whisper.