Commercial · Sovereignty 1/5

D-ID (Expressive V4)

Enterprise AI agents with expressive avatars and native RAG

TTFR Latency

~450ms

real-time

Cost / minute

$0.40/min

real-time

Visual Quality

8/10

estimated score

Protocols

WebRTC, REST

Avatar Customisation

RAG / Knowledge Base

Native 'Knowledge' API: connect custom knowledge bases (PDF, text). The agent queries them in real time during the conversation.

Behavior & Personality

System prompt configuration for personality, tone, response style. Full agent persona definition.

Body Language & Gestures

V4 Expressive: adaptive body language generated to match voice inflection and emotional context. Natural posture.

Facial Expressions

V4 Expressive: emotion and micro-expression control via sentiment tags (joy, seriousness, surprise).

Voice & Voice Cloning

ElevenLabs + Microsoft Azure TTS partnerships. Hundreds of multilingual voices. Pitch, speed, accent control. Voice cloning available.

Persona Fine-Tuning

Full agent persona: system prompt + knowledge base + voice + visual identity. Consistent role maintenance throughout interaction.

Avatar Training

Video required: Yes
Duration: 3–5 minutes (Premium+ avatars)
Resolution: 1080p minimum at 30fps
Format: MP4 or MOV
Consent required: Yes (mandatory)
Processing time: Several hours
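
The requirements above can be checked before upload with a small preflight routine. This is a minimal sketch, not part of any D-ID SDK: the `TrainingVideo` type and `check_training_video` function are hypothetical names, the metadata is assumed to be extracted elsewhere, and "1080p minimum at 30fps" is read here as a frame height of at least 1080 px and a frame rate of at least 30 fps.

```python
from dataclasses import dataclass


@dataclass
class TrainingVideo:
    """Metadata of a candidate training video (hypothetical type)."""
    duration_s: float    # total length in seconds
    height_px: int       # vertical resolution, e.g. 1080
    fps: float           # frames per second
    container: str       # file extension, e.g. "mp4" or ".mov"
    consent_given: bool  # recorded consent of the person filmed


def check_training_video(video: TrainingVideo) -> list[str]:
    """Return a list of problems; an empty list means all checks pass."""
    problems = []
    # 3-5 minutes, expressed in seconds
    if not 180 <= video.duration_s <= 300:
        problems.append("duration must be between 3 and 5 minutes")
    # Assumption: "1080p minimum" means height >= 1080 px
    if video.height_px < 1080:
        problems.append("resolution must be at least 1080p")
    # Assumption: "at 30fps" is treated as a lower bound
    if video.fps < 30:
        problems.append("frame rate must be at least 30 fps")
    if video.container.lower().lstrip(".") not in {"mp4", "mov"}:
        problems.append("format must be MP4 or MOV")
    if not video.consent_given:
        problems.append("consent is mandatory")
    return problems


if __name__ == "__main__":
    candidate = TrainingVideo(240, 1080, 30, "mp4", True)
    print(check_training_video(candidate) or "ready to upload")
```

Returning a list of problems rather than raising on the first failure lets the recording be corrected in one pass.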

Best Practices

  • 01. Frontal uniform lighting, no shadows
  • 02. Neutral fixed background
  • 03. Stable camera at eye level
  • 04. Natural speech with pauses
  • 05. Avoid excessive head movements
  • 06. Clean audio recording (no background noise)
  • 07. No face occlusions (sunglasses, hands)

API Analysis

Protocols

REST, WebRTC

SDKs

JavaScript, Python
Webhooks: Yes

Concurrent Sessions

Plan-dependent

Rate Limits

BYO-S3 available on Pro+ plans

Key Features

  • Knowledge API for native RAG
  • Agents API for full conversational agent management
  • Expressive V4 emotion tags
  • ElevenLabs + Azure TTS integration
  • WebRTC real-time streaming

API Constraints

  • Expressive V4 requires specific credits and is not available on all stock avatars
  • No on-premise standard option
  • Data stored on D-ID infrastructure by default (BYO-S3 on Pro+)
  • Celebrity face moderation

Pricing Model

Model: Credit-based (1 credit = 15s video)

Plan        Price           Included minutes    Overage
Trial       Free (14 days)  Limited             N/A
Build       $18/mo          32 min streaming    ~$0.56/min
Launch      $40/mo          90 min streaming    ~$0.44/min
Scale       $158/mo         400 min streaming   ~$0.40/min
Enterprise  Custom          Custom              Custom

Free tier
Cloud only
Enterprise pricing
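
To compare plans for a given monthly usage, the listed figures can be turned into a small estimate. This is a rough sketch under stated assumptions: overage is billed per minute at the approximate rates listed, Trial and Enterprise are excluded because they have no numeric prices, the Expressive V4 premium and the annual discount are ignored, and the function names are illustrative only.

```python
import math

# Plan name -> (monthly price in USD, included streaming minutes,
# approximate overage in USD per minute), as listed in the pricing model.
PLANS = {
    "Build": (18.0, 32, 0.56),
    "Launch": (40.0, 90, 0.44),
    "Scale": (158.0, 400, 0.40),
}


def minutes_to_credits(minutes: float) -> int:
    """Convert minutes to credits at 1 credit = 15 s, rounding up."""
    return math.ceil(minutes * 60 / 15)


def monthly_cost(plan: str, minutes_used: float) -> float:
    """Estimated monthly bill in USD: base price plus overage."""
    price, included, overage = PLANS[plan]
    extra_minutes = max(0.0, minutes_used - included)
    return round(price + extra_minutes * overage, 2)


def cheapest_plan(minutes_used: float) -> str:
    """Name of the plan with the lowest estimated bill for this usage."""
    return min(PLANS, key=lambda name: monthly_cost(name, minutes_used))


if __name__ == "__main__":
    for usage in (30, 80, 400):
        plan = cheapest_plan(usage)
        print(usage, "min:", plan, monthly_cost(plan, usage))
```

Under these assumptions, 80 minutes a month costs about $44.88 on Build with overage but $40 on Launch, so the break-even points between plans are worth checking before committing.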

Hidden costs / watch out

  • Expressive V4 carries a credit premium
  • BYO-S3 only on Pro+
  • 20–30% discount on annual commitment

Sovereignty & Hosting

Sovereignty Score

1/5

Hosting

AWS US-East-1

GDPR

Yes

On-premise

No

Sovereignty detail

D-ID is an Israeli company hosted on AWS US-East-1. GDPR principles are respected, but there is no EU hosting or EU datacenter.

Constraints & Limits

  • No manual hand/arm gesture control
  • US hosting only (AWS US-East-1)
  • Expressive V4 not available on all stock avatars
  • Celebrity face moderation
  • No on-premise standard option

DigiDouble Relevance

Score

7/10

Strong native RAG and expressive emotion control make D-ID compelling for educational agents. V4 Expressive is a significant upgrade. Main limitations: US-only hosting and a higher cost per minute than Simli or bitHuman.