About this portal

DigiDouble Research Portal — an independent reference site on the state of the art in voice pipelines and conversational video avatars.

What is this portal?

This portal documents the state of the art in speech technologies, namely speech-to-text (STT) and text-to-speech (TTS), and in conversational video avatars as of 2025–2026. It was created in the context of the DigiDouble project (an Innosuisse research project aiming to create a photorealistic conversational avatar from existing video archives), but its content is designed to be useful for any project involving these technologies.

The portal covers 10 STT engines, 16 TTS engines, and 11+ video avatar platforms, with benchmarks, strategic analyses, interactive decision tools, and evaluation frameworks. It is regularly updated to reflect the rapidly evolving market.

Editorial philosophy

This portal does not recommend specific solutions. It presents options, metrics, questions to ask, and decision frameworks. The goal is to help teams understand the technology landscape and make informed decisions based on their own constraints — data sovereignty, budget, latency, technical expertise.

🎯

Open questions

Each section poses the key decision questions without imposing an answer.

⚖️

Vendor neutrality

No commercial partnerships. Data comes from public benchmarks and independent tests.

🔄

Continuous updates

The market moves fast. This portal is updated to reflect new releases and pricing changes.

Portal structure

The Project

Context and vision of the DigiDouble project: objectives, research challenges, target architecture, identified gaps, and academic state of the art. DigiDouble-specific content.

Voice Pipeline

STT (10 engines) and TTS (16 engines) comparisons, audio synthesis benchmarks, layer-by-layer decision framework, custom scoring, and interactive V2V pipeline diagram. Useful for any voice project, independently of DigiDouble.

Video Avatars

Comparison of 11+ streaming video avatar platforms, an interactive cost simulator, and analyses of business and market challenges, avatar behavior and expressiveness, and an emotional toolbox.

Portal presentation (this page) and glossary of technical terms used in the portal.

Technical context

The DigiDouble project is an Innosuisse research project led by the IDIAP laboratory (Institute for Perceptual Artificial Intelligence) in partnership with Memoways and Gamilab. The goal is to create a photorealistic conversational avatar of a person from their existing video archives, capable of interacting in real time with users.

Research partner

IDIAP

Industry partners

Memoways, Gamilab

Funding

Innosuisse

Expected start

Autumn 2026

STT engines covered

10

TTS engines covered

16

Unfamiliar with a term?

The glossary explains the 30+ technical terms used in this portal: WER, TTFA, ELO, SSM, diarization, sovereignty, lock-in, and many more.

View glossary →