bitHuman
Edge-deployable avatar with gesture API and self-hosted option
TTFR Latency
~300ms
real-time
Cost / minute
$0.050/min
real-time
Visual Quality
7/10
estimated score
Protocols
WebRTC, REST, LiveKit
Avatar Customisation
RAG / Knowledge Base
Agent Context API: inject 'silent knowledge' (documents, data) into the avatar's context. CO-STAR framework for structured knowledge.
Behavior & Personality
CO-STAR framework: Context, Objective, Style, Tone, Audience, Response. Emotional tone (empathetic, calm). Structured response style.
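As an illustration of how the six CO-STAR fields could be assembled into one persona prompt, here is a minimal sketch. The `CoStarPrompt` class, its `render` method, and the sample values are hypothetical and not part of any bitHuman SDK; only the field names come from the framework as listed above.

```python
from dataclasses import dataclass, fields


@dataclass
class CoStarPrompt:
    """Hypothetical container for the six CO-STAR fields, in framework order."""
    context: str
    objective: str
    style: str
    tone: str
    audience: str
    response: str

    def render(self) -> str:
        # Emit one labelled section per field, separated by blank lines.
        sections = [
            f"# {f.name.upper()}\n{getattr(self, f.name).strip()}"
            for f in fields(self)
        ]
        return "\n\n".join(sections)


# Sample values are invented for illustration only.
prompt = CoStarPrompt(
    context="You are a virtual receptionist for a dental clinic.",
    objective="Help visitors book, move, or cancel appointments.",
    style="Short, structured answers.",
    tone="Empathetic and calm.",
    audience="Patients with no technical background.",
    response="Plain sentences, at most three per turn.",
)
print(prompt.render())
```

Keeping the fields in a typed structure makes it harder to ship a persona with a section missing, since the constructor rejects incomplete input.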
Body Language & Gestures
Dynamics API: trigger specific gestures via keywords or API commands (hand signs, nods, laughs). Unique feature among commercial platforms.
Facial Expressions
Real-time facial expressions at 25fps, synchronized with audio (lip-sync). Emotion-driven micro-expressions.
Voice & Voice Cloning
Voice cloning via Voice Upload. OpenAI and ElevenLabs TTS integration. Custom voice persona.
Persona Fine-Tuning
CO-STAR prompt framework + Dynamics API for gesture triggers. Persona can include non-human characters (animals, illustrated characters).
Avatar Training
Best Practices
- 01. Image: frontal, well-lit, neutral expression
- 02. Video (Likeness): single person, centered, minimal movement
- 03. Max 30 seconds video
- 04. No complex gestures in training footage
- 05. Expression model requires GPU for custom faces outside catalog
API Analysis
Protocols
SDKs
Concurrent Sessions
Plan-dependent
Rate Limits
250 credits fixed cost for initial agent generation
Key Features
- Dynamics API: gesture triggers via keywords or API
- CO-STAR framework for structured persona
- Self-hosted deployment (CPU or GPU)
- Edge deployment on ARM/x86 hardware
- Offline operation possible
- LiveKit integration for WebRTC streaming
API Constraints
- Expression model (GPU) required for custom faces outside catalog
- LiveKit dependency for real-time streaming
- Manual memory persistence via context injection
- No webhook support
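Because memory persistence is manual, one possible approach is to store facts locally between sessions and rebuild a context string at the start of the next one. The sketch below covers only that local part. The function names, the JSON file layout, and the `max_chars` budget are assumptions, and the actual injection call is left out because its signature is not described here.

```python
import json
import tempfile
from pathlib import Path


def load_memory(path: Path) -> list[str]:
    """Return the remembered facts, or an empty list on first run."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def remember(path: Path, fact: str) -> list[str]:
    """Append a fact (skipping duplicates) and write the file back."""
    facts = load_memory(path)
    if fact not in facts:
        facts.append(fact)
        path.write_text(json.dumps(facts, indent=2), encoding="utf-8")
    return facts


def build_context(facts: list[str], max_chars: int = 2000) -> str:
    """Build a context block, dropping the oldest facts if over budget."""
    kept: list[str] = []
    used = 0
    for fact in reversed(facts):  # walk from newest to oldest
        line = f"- {fact}"
        if used + len(line) + 1 > max_chars:
            break
        kept.append(line)
        used += len(line) + 1
    # Restore chronological order for the final text.
    return "Known from earlier sessions:\n" + "\n".join(reversed(kept))


# Demo with a temporary file; a real deployment would pick a durable path.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "memory.json"
    remember(store, "User prefers morning appointments")
    remember(store, "User's name is Dana")
    context = build_context(load_memory(store))
print(context)
```

The resulting `context` string is what would be passed to the avatar as injected knowledge at session start.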
Pricing Model
| Plan | Price | Included allowance | Overage |
|---|---|---|---|
| Free | $0/mo | ~99 min (Essence) | N/A |
| Basic | $10/mo | 700 credits | $0.01/min (Essence) |
| Pro | $25/mo | 2300 credits | $0.01/min (Essence) |
| Creator | $75/mo | 10000 credits | $0.01/min (Essence) |
| Business | $200/mo | 30000 credits | $0.01/min (Essence) |
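To compare the paid plans on a like-for-like basis, the table can be reduced to a price per included credit. This is a rough sketch using only the figures in the table; the `PLANS` dictionary and `price_per_credit` helper are illustrative names.

```python
# Monthly price (USD) and included credits, copied from the table above.
PLANS = {
    "Basic": (10, 700),
    "Pro": (25, 2300),
    "Creator": (75, 10000),
    "Business": (200, 30000),
}


def price_per_credit(monthly_price: float, credits: int) -> float:
    """Effective USD cost of one included credit."""
    return monthly_price / credits


rates = {name: price_per_credit(p, c) for name, (p, c) in PLANS.items()}
for name, rate in rates.items():
    print(f"{name}: ${rate:.4f} per included credit")
```

By these figures the per-credit price falls as the plan size grows, from roughly $0.0143 on Basic to roughly $0.0067 on Business.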
Hidden costs / watch out
- 250 credits fixed for agent generation
- Expression model (GPU): 4 credits/min vs 1 credit/min for Essence
- GPU hardware cost if self-hosted
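The first two items above can be combined into a quick estimate of usable minutes per plan. The sketch assumes the 250-credit generation fee is deducted from the plan's credits once per new agent, which the listing implies but does not state outright; the function and constant names are illustrative.

```python
AGENT_GENERATION_CREDITS = 250  # fixed cost per generated agent
CREDITS_PER_MINUTE = {"essence": 1, "expression": 4}


def minutes_available(plan_credits: int, model: str, new_agents: int = 1) -> int:
    """Whole minutes left after paying for agent generation."""
    remaining = plan_credits - new_agents * AGENT_GENERATION_CREDITS
    if remaining < 0:
        return 0
    return remaining // CREDITS_PER_MINUTE[model]


# Basic plan (700 credits) with one new agent.
print(minutes_available(700, "essence"))     # 450
print(minutes_available(700, "expression"))  # 112
```

On the Basic plan, a single agent generation consumes over a third of the monthly credits, and switching to the Expression model cuts the remaining minutes to a quarter.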
Sovereignty & Hosting
Sovereignty Score
Hosting
Self-hosted (on-premise, CPU or GPU) OR cloud. Full sovereignty option.
GDPR
Yes
On-premise
Yes
Sovereignty detail
Full self-hosted option (on-premise, CPU or GPU). Total data sovereignty. No cloud dependency required.
Constraints & Limits
- Expression model requires GPU (higher cost)
- LiveKit dependency for streaming
- No webhook support
- Manual memory persistence required
- Body customisation limited vs. facial expressions
DigiDouble Relevance
Score
8/10
Unique Dynamics API for gesture control is a key differentiator. Self-hosted CPU deployment enables total sovereignty and 10x cost reduction. Ideal for DigiDouble's Swiss sovereignty requirements. Limitation: Expression model requires GPU for custom faces.