Target Architecture
Overview of architectural blocks: available (green), R&D required (blue), Memoways internal (yellow). The <2s latency budget structures all choices.
Available: mature commercial or open-source solutions
R&D: fundamental research required (core of the Innosuisse project)
Memoways internal: existing expertise and infrastructure (14 years)
The <2s latency constraint structures everything
Latency is not just a technical problem; it is a user experience problem. Beyond 2 seconds, users lose their train of thought, and the avatar stops being a presence and becomes a tool. DigiDouble's goal is to cross the conversational naturalness threshold: <2s end-to-end, with first sound within 500ms.
Cognitive thresholds of perceived latency
| Threshold | Qualification | UX Impact | Achievable (DD) |
|---|---|---|---|
| <500ms | Perceptive fluidity | User perceives a slight delay, but the interaction remains natural. Target for TTS first audio. | ✓ Yes |
| 1s | Acceptable | Conversational comfort threshold. Beyond this, users start anticipating the wait. Target for TTFB (first video frame). | ✓ Yes |
| 2s | Natural limit | Conversational naturalness threshold (Nielsen 1993, validated by human dialogue research). Beyond this, conversation becomes a series of waits. DigiDouble TTFR target. | R&D Goal |
| 6–12s | Engagement break | Current DigiDouble latency (HeyGem OS). User loses the thread, avatar stops being a presence. High drop-off rate. This is the problem to solve. | Current problem |
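The thresholds above can be read as a simple latency budget check. The following is a minimal sketch, not part of the DigiDouble codebase: the names `classify_latency`, `TurnTimings`, and `missed_targets` are hypothetical, the band edges are treated as inclusive upper bounds (an assumption), and anything above 2 s is reported generically as "Beyond natural limit" because the table only names the 6–12s range.

```python
from dataclasses import dataclass

# Upper bounds in seconds, taken from the thresholds table.
# Treating them as inclusive band edges is an assumption of this sketch.
BANDS = [
    (0.5, "Perceptive fluidity"),
    (1.0, "Acceptable"),
    (2.0, "Natural limit"),
]


def classify_latency(seconds: float) -> str:
    """Return the table's qualification for a measured latency."""
    if seconds < 0:
        raise ValueError("latency cannot be negative")
    for upper, label in BANDS:
        if seconds <= upper:
            return label
    # The table only labels 6-12 s ("Engagement break"); everything
    # above 2 s is grouped here under a generic label.
    return "Beyond natural limit"


@dataclass
class TurnTimings:
    """Hypothetical per-turn measurements, in seconds."""
    first_audio_s: float        # time until first TTS audio
    first_video_frame_s: float  # time until first video frame
    end_to_end_s: float         # full response latency


# Targets stated in the text: 500 ms first sound, 1 s first frame, 2 s end-to-end.
TARGETS = {"first_audio_s": 0.5, "first_video_frame_s": 1.0, "end_to_end_s": 2.0}


def missed_targets(t: TurnTimings) -> list[str]:
    """List the names of the targets that a turn exceeded."""
    return [name for name, limit in TARGETS.items() if getattr(t, name) > limit]
```

For example, `missed_targets(TurnTimings(0.7, 0.9, 8.0))` flags the first-audio and end-to-end targets, which mirrors the "current problem" row of the table.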