Distributed Wasserstein Barycenter

Concept page for Qiao's current doctoral focus on computing Wasserstein barycenters from distributed local measures.

Distributed Wasserstein Barycenter is the concept page for Xinbao Qiao's current doctoral focus within AI and networks. A Wasserstein barycenter is a probability measure that summarizes several input distributions under an optimal-transport distance. In a distributed setting, the input measures are held by different parties, so the problem is not only statistical but also networked: the system must compute or approximate a common reference while respecting communication and data-access constraints.¹

Definition

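For local probability measures $\mu_1,\ldots,\mu_K$ with weights $\lambda_k \geq 0$ and $\sum_k \lambda_k = 1$, a $p$-Wasserstein barycenter can be written as

$$
\nu^\star \in \arg\min_{\nu \in \mathcal{P}(\mathcal{X})} \sum_{k=1}^{K} \lambda_k W_p^p(\nu, \mu_k).
$$

Here $W_p$ is the $p$-Wasserstein distance and $\mathcal{P}(\mathcal{X})$ is the set of probability measures on the underlying space $\mathcal{X}$.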
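
As a standard worked case (a textbook special case, not something specific to the distributed setting of this wiki), take $p = 2$ and one-dimensional Gaussian inputs $\mu_k = \mathcal{N}(m_k, s_k^2)$. The distance then has a closed form, and the barycenter is again Gaussian:

$$
W_2^2\big(\mathcal{N}(m_1, s_1^2), \mathcal{N}(m_2, s_2^2)\big) = (m_1 - m_2)^2 + (s_1 - s_2)^2,
\qquad
\nu^\star = \mathcal{N}\Big(\sum_{k=1}^{K} \lambda_k m_k,\ \Big(\sum_{k=1}^{K} \lambda_k s_k\Big)^2\Big).
$$

For example, equal weights on $\mathcal{N}(0, 1)$ and $\mathcal{N}(4, 9)$ give $\nu^\star = \mathcal{N}(2, 4)$: the means and the standard deviations are averaged. The barycenter is therefore a single Gaussian located between the inputs, unlike the bimodal mixture $\tfrac{1}{2}\mu_1 + \tfrac{1}{2}\mu_2$.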

In the centralized formulation, all $\mu_k$ are available to the solver. In the distributed version relevant to this wiki, each $\mu_k$ may correspond to a local dataset, client, institution, or device. The research question therefore includes what information needs to move across the network, how much it can be compressed, and whether the resulting barycenter is useful as a global distributional proxy.
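
One way to make the question of what must move across the network concrete is the special case of one-dimensional data with $p = 2$, where the barycenter's quantile function is the weighted average of the input quantile functions. Each party can then send a fixed-size quantile sketch instead of raw samples. The sketch below assumes that special case only; the function names (`local_quantile_sketch`, `aggregate_barycenter`, `w2_squared_from_sketches`), the sketch size, and the single-coordinator topology are illustrative assumptions, not a description of Qiao's method.

```python
import numpy as np


def local_quantile_sketch(samples, num_quantiles):
    """Client side: compress a 1-D local sample into evenly spaced quantiles.

    Only this fixed-size vector leaves the client, not the raw samples.
    """
    # Midpoint levels (i + 0.5) / m avoid the extreme 0 and 1 quantiles.
    levels = (np.arange(num_quantiles) + 0.5) / num_quantiles
    return np.quantile(np.asarray(samples, dtype=float), levels)


def aggregate_barycenter(sketches, weights):
    """Coordinator side: weighted average of the clients' quantile sketches.

    In 1-D with p = 2 this approximates the quantile function of the
    Wasserstein barycenter, evaluated at the shared quantile levels.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return np.average(np.stack(sketches), axis=0, weights=weights)


def w2_squared_from_sketches(sketch_a, sketch_b):
    """Approximate W_2^2 between two 1-D measures from their sketches.

    Uses the 1-D identity W_2^2 = integral of squared quantile differences,
    discretized at the shared quantile levels.
    """
    return float(np.mean((np.asarray(sketch_a) - np.asarray(sketch_b)) ** 2))


# Hypothetical example: three clients with different local distributions
# and different sample sizes, weighted by their share of the data.
rng = np.random.default_rng(0)
client_samples = [
    rng.normal(loc=0.0, scale=1.0, size=500),
    rng.normal(loc=4.0, scale=3.0, size=2000),
    rng.normal(loc=1.0, scale=0.5, size=300),
]
sizes = np.array([len(s) for s in client_samples], dtype=float)
weights = sizes / sizes.sum()

# Each client communicates 64 numbers regardless of its local sample size.
sketches = [local_quantile_sketch(s, num_quantiles=64) for s in client_samples]
barycenter = aggregate_barycenter(sketches, weights)

# Distance from each client to the shared reference distribution.
for k, sketch in enumerate(sketches):
    print(f"client {k}: approx W2^2 to barycenter = "
          f"{w2_squared_from_sketches(sketch, barycenter):.3f}")
```

The design choice here is that communication cost depends on the sketch size rather than on local sample counts, which is one simple answer to how much the transmitted information can be compressed. In higher dimensions no such closed form exists, and iterative or entropic-regularized solvers are typically needed.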

Role in this wiki

This page sits between Wasserstein Geometry, Distributed Learning, and Collaborative Evaluation. It explains why a geometric concept appears in Qiao's AI-and-networks line: a barycenter can serve as a shared reference distribution when no party has the complete data distribution. Such a reference can support model evaluation, synthetic-data verification, sample scoring, or comparison across non-identically distributed clients.

The page also follows the LLM-wiki pattern used by Xinbaopedia: instead of leaving "Wasserstein barycenter" as a transient phrase inside a biography, the concept gets its own node. Later papers, notes, or project updates can link back here and refine the local synthesis.

Connection to Qiao's work

Qiao's ICML 2026 work on sample-selection bias and model collapse already uses collaborative Wasserstein-style signals to reason about synthetic-data failure under siloed access. The current doctoral focus on distributed Wasserstein barycenters continues that direction at the infrastructure level. It asks how a reliable reference distribution can be computed when the evidence is split across the network, rather than assuming that evaluation data can be pooled first.

This connects to AI and networks because the computational object is shaped by the communication pattern. It connects to Synthetic Data because recursive generation needs distributional checks. It also connects to Data Centric ML because the barycenter can become a tool for deciding which data or samples matter across parties.

Footnotes

¹ Agueh and Carlier introduced Wasserstein-space barycenters in a SIAM paper, Barycenters in the Wasserstein Space. Cuturi and Doucet's ICML 2014 paper, Fast Computation of Wasserstein Barycenters, is a standard computational reference.
