Target Architecture

Overview of architectural blocks: available (green), R&D required (blue), Memoways internal (yellow). The <2s latency budget structures all choices.

AvailableMature commercial or open-source solutions
R&DFundamental research required — core of the Innosuisse project
Memoways internalExisting expertise and infrastructure (14 years)
AvailableR&DMemoways InternalCritical bottleneckUSERVoice/TextAVAILABLE — GamilabSovereign ASR + STTSwiss-hosted · HITL optional~300ms targetR&DAXIS 1Memory3-layer arch.AXIS 2aExpressive TTSPersonalized prosodyAXIS 2b ⚠Avatar GenerationBehavioral fidelity⚠ <500ms targetAXIS 3OrchestrationDeterministic-organicArchitecture challengeINTERNAL — MemowaysNode EditorConversation graphConfigurable PlayerMode pédagogique / Mode narratifEXPERIENCE<2s targetTARGET LATENCY BUDGET<300msASR+STT<200msOrchestration<500msSLM+LLM<200msTTS<500msAvatar (R&D)<300msStreaming= <2stotal targetAll values are R&D targets — end-to-end benchmarks planned spring 2026
Click to expand

The <2s latency constraint structures everything

Latency is not just a technical problem — it is a user experience problem. Beyond 2 seconds, users lose their train of thought, the avatar stops being a presence and becomes a tool. DigiDouble's goal is to cross the conversational naturalness threshold: <2s end-to-end, with first sound within 500ms.

Cognitive thresholds of perceptive latency

ThresholdQualificationUX ImpactAchievable (DD)
<500msPerceptive fluidityPerceptive fluidity threshold. User perceives slight delay but interaction remains natural. Target for TTS first audio.✓ Yes
1sAcceptableConversational comfort threshold. Beyond this, users start anticipating the wait. Target for TTFB (first video frame).✓ Yes
2sNatural limitConversational naturalness threshold (Nielsen 1993, validated by human dialogue research). Beyond this, conversation becomes a series of waits. DigiDouble TTFR target.R&D Goal
6–12sEngagement breakCurrent DigiDouble latency (HeyGem OS). User loses the thread, avatar stops being a presence. High drop-off rate. This is the problem to solve.Current problem