commercial

Sovereignty 1/5

Runway Characters

Zero-shot avatar from image with tool calling and RAG

TTFR Latency

~400ms

real-time

Cost / minute

$0.210/min

real-time

Visual Quality

8/10

estimated score

Protocols

WebRTC, REST

Avatar Customisation

RAG / Knowledge Base

Upload .txt files as knowledge base. Avatar queries them in real-time. Tool Calling enables external action triggers during conversation.

Behavior & Personality

Personality via system prompt (overridable per session). Voice style presets (professional, authoritative, etc.).

Body Language & Gestures

Appearance is fixed by the reference image. The model generates natural body language, but the API offers no granular control over it.

Facial Expressions

GWM-1 generates natural micro-expressions and facial movements. Not manually scriptable.

Voice & Voice Cloning

Voice presets (Clara, Vincent, etc.) with style variations. No custom voice cloning via Characters API at launch.

Persona Fine-Tuning

System prompt overridable per session call. Tool Calling for external integrations. Starting script customisable.

Avatar Training

Video required: No (image sufficient)

Duration: Single reference image

Resolution: 1088×704 px recommended (landscape; listed as 16:9, although 1088×704 is closer to 1.55:1)

Format: JPEG / PNG

Consent required: No

Processing time: Seconds (no training required)

Best Practices

01.Front-facing, well-lit, face centered
02.No obstructions (sunglasses, hair over face)
03.Works with human, stylized, or mascot images
04.No audio samples needed for built-in voices

API Analysis

Protocols

REST, WebRTC

SDKs

React SDK (official), Python SDK

Webhooks: Yes

Concurrent Sessions

Plan-dependent

Rate Limits

Session max: 5 minutes. Pre-provisioning required (NOT_READY → READY state).
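
The pre-provisioning step above implies that a client has to wait for READY before opening the WebRTC connection. A minimal sketch of that wait, assuming a caller-supplied `get_status` function that returns the session state as a string (this function, and the timeout and polling values, are hypothetical and not part of any documented Runway SDK):

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    timeout_s: float = 30.0,
    poll_interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll a hypothetical status getter until the session reports READY.

    Returns True once READY is seen and False if the timeout elapses first.
    Raises RuntimeError if the session reports FAILED while provisioning.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "READY":
            return True
        if status == "FAILED":
            raise RuntimeError("session failed during provisioning")
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

Since the page also lists a READY webhook, an event-driven wait may be preferable to polling where a public callback endpoint is available.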

Key Features

Avatar ID management via REST
Session lifecycle: NOT_READY → READY → RUNNING → COMPLETED/FAILED
Dynamic personality override per session
Tool Calling for external action triggers
Webhooks: READY, RUNNING, COMPLETED, FAILED
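
The lifecycle and webhook events listed above can be tracked client-side with a small state machine. The sketch below is illustrative only: the state names come from this page, while the transition table (in particular, that FAILED is reachable from every non-terminal state) and the `apply_event` helper are assumptions, not documented API behaviour.

```python
from enum import Enum


class SessionState(Enum):
    NOT_READY = "NOT_READY"
    READY = "READY"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Allowed forward transitions, read off the lifecycle line above.
# Assumption: FAILED can follow any non-terminal state.
ALLOWED = {
    SessionState.NOT_READY: {SessionState.READY, SessionState.FAILED},
    SessionState.READY: {SessionState.RUNNING, SessionState.FAILED},
    SessionState.RUNNING: {SessionState.COMPLETED, SessionState.FAILED},
    SessionState.COMPLETED: set(),
    SessionState.FAILED: set(),
}


def apply_event(current: SessionState, event: str) -> SessionState:
    """Advance the tracked state from a webhook event name; reject illegal jumps."""
    target = SessionState(event)  # raises ValueError for unknown event names
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Rejecting out-of-order events keeps a client from, for example, opening a WebRTC connection on a RUNNING event that arrives for a session it never saw become READY.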

API Constraints

5-minute maximum session duration
Pre-provisioning required before WebRTC connection
No custom voice cloning via Characters API
US hosting only

Pricing Model

Model: Credit-based (1 credit = $0.01)

Plan	Price	Included minutes	Overage
Pay-as-you-go	Credits only	~$0.21/min	N/A
Standard	~$12/mo	Included credits	$0.21/min
Pro	~$28/mo	More credits	$0.21/min
Enterprise	Custom	Custom	Custom

No free tier

Cloud only

Enterprise pricing

Hidden costs / watch out

Session initialisation costs 2 credits per session ($0.02 fixed)
Runway subscription may be required for advanced features
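
To make the effect of the fixed initialisation fee concrete, the figures on this page can be combined into a per-session estimate. The sketch assumes usage is billed linearly at $0.21/min (21 credits) and rounded up to a whole credit; the rounding rule and the function names are assumptions, not stated on the page.

```python
import math

CREDITS_PER_USD = 100      # from the page: 1 credit = $0.01
INIT_CREDITS = 2           # from the page: 2 credits charged at session start
CREDITS_PER_MINUTE = 21    # derived: $0.21/min divided by $0.01 per credit


def session_credits(minutes: float) -> int:
    """Estimated credits for one session (assumes linear billing, rounded up)."""
    return INIT_CREDITS + math.ceil(minutes * CREDITS_PER_MINUTE)


def session_cost_usd(minutes: float) -> float:
    """Estimated dollar cost for one session."""
    return session_credits(minutes) / CREDITS_PER_USD
```

Under these assumptions a full 5-minute session comes to 2 + 105 = 107 credits, or $1.07, so the effective rate is slightly above the headline $0.21/min and rises further for short sessions.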

Sovereignty & Hosting

Sovereignty Score

1/5

Hosting

AWS US

GDPR

Yes

On-premise

No

Sovereignty detail

AWS US. SOC2 Type II. GDPR/CCPA compliant. No EU hosting, no on-premise.

Constraints & Limits

5-minute maximum session duration
No custom voice cloning in Characters API
No manual body gesture control
US hosting only
Pre-provisioning adds latency before session start
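
One way to reason about the 5-minute cap for longer interactions is to plan them as a sequence of back-to-back sessions. The helper below is a hypothetical planning sketch, not an API feature; the page does not say whether conversation context carries over between sessions, so that would need to be handled separately (for example via the per-session system prompt).

```python
MAX_SESSION_MIN = 5  # from the page: 5-minute maximum session duration


def plan_sessions(total_minutes: float) -> list[float]:
    """Split a target interaction length into consecutive session lengths."""
    if total_minutes <= 0:
        return []
    full, rest = divmod(total_minutes, MAX_SESSION_MIN)
    plan = [float(MAX_SESSION_MIN)] * int(full)
    if rest > 0:
        plan.append(float(rest))
    return plan
```

For a 12-minute interaction this yields three sessions of 5, 5, and 2 minutes. Each of them would need its own pre-provisioning step and would presumably incur the per-session initialisation fee.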

DigiDouble Relevance

Score

7/10

Tool Calling is a unique differentiator for interactive educational scenarios. Zero-shot from image is excellent for rapid prototyping. 5-min session limit is a significant constraint for DigiDouble's longer interaction scenarios.
