Platform for creating interactive conversational experiences with video avatars — combining real-time AI dialogue, photorealistic avatar generation, and intelligent cinematographic sequencing.
This portal documents the fundamental research challenges of the Memoways × Gamilab × IDIAP collaboration, conducted within an Innosuisse project.
| Axis | Challenge | Researcher | IDIAP Group | Status |
|---|---|---|---|---|
| AX1 Conversational Memory | The Conversational Memory Gap | Dr. Elena Epure | Language & Information Technologies | PRIMARY |
| AX2 Expressive Avatar & Behavioral Fidelity | The Behavioral Fidelity Gap | Dr. Mathew Magimai-Doss | Speech & Audio Processing | PRIMARY |
| AX3 Deterministic-Organic Orchestration | Balancing narrative constraints against AI conversational freedom | Internal team | Architecture | SECONDARY |
| AX4 Multi-Stream Synchronization | Coordinate 5 streams with <100 ms desync | Memoways | Internal Engineering | INTERNAL |
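The AX4 constraint (keeping 5 streams within a 100 ms desync budget) can be illustrated with a minimal sketch: each stream's latest presentation timestamp is compared against the median across streams, and any stream drifting past the budget is flagged. The stream names and the median-reference strategy are hypothetical illustrations, not the project's actual synchronization mechanism.

```python
# Illustrative sketch only: flag streams whose latest presentation
# timestamp drifts more than 100 ms from the median across all streams.
# Stream names below are hypothetical placeholders.
from statistics import median

DESYNC_BUDGET_MS = 100  # AX4 target: <100 ms desync across streams

def desynced_streams(timestamps_ms: dict[str, float]) -> list[str]:
    """Return the names of streams whose timestamp deviates from the
    median of all streams by more than the desync budget."""
    ref = median(timestamps_ms.values())
    return [name for name, ts in timestamps_ms.items()
            if abs(ts - ref) > DESYNC_BUDGET_MS]

# Five hypothetical streams; the video stream lags by ~175 ms.
streams = {"audio": 1000.0, "video": 820.0, "subtitles": 990.0,
           "gestures": 1010.0, "scene": 995.0}
print(desynced_streams(streams))  # → ['video']
```

A production system would synchronize against a shared master clock (e.g. RTP-style timestamps) rather than a per-tick median, but the budget check is the same shape.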
Each exchange passes through 6 stages; avatar generation is the main bottleneck (5–15 s currently, with a 500 ms target).
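A back-of-the-envelope latency budget makes the bottleneck concrete. The stage names and all figures below are hypothetical placeholders, except the avatar-generation numbers taken from the text (5–15 s today, 500 ms target); the point is only that one stage dominates the end-to-end total.

```python
# Hypothetical per-stage latency budget for one exchange (ms).
# Only the avatar-generation figures come from the text; the other
# stage names and values are illustrative assumptions.
current_ms = {
    "ASR": 300,
    "dialogue/LLM": 800,
    "TTS": 400,
    "avatar generation": 10_000,  # within the 5-15 s range cited
    "sequencing": 150,
    "streaming": 200,
}
# Same budget with avatar generation at the 500 ms target.
target_ms = dict(current_ms, **{"avatar generation": 500})

print(sum(current_ms.values()))  # → 11850
print(sum(target_ms.values()))   # → 2350
```

Even with generous assumptions for the other five stages, end-to-end latency is dominated by avatar generation, which is why it is the focus of the optimization effort.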
Current avatar platforms (HeyGen, Synthesia) produce high-quality video but with 15–40 seconds of latency per exchange, which is incompatible with natural conversation. Real-time solutions (NVIDIA ACE, Beyond Presence) require proprietary infrastructure and do not support behavioral personalization from existing archives. DigiDouble aims to bridge this gap: sovereign, open, personalized, and real-time.
The fundamental challenge is to achieve a 10–20× latency reduction while preserving the behavioral fidelity of a specific person, a problem at the intersection of speech processing, computer vision, NLP, and systems engineering.