Research

Research overview for Qiao Xinbao.

This page summarizes the main research directions in Qiao Xinbao's academic wiki. It functions as a compiled map of linked topic pages rather than a static list of interests. The current center of gravity is data-centric ML and the two-way relationship between AI and networks (AI for Networks and Networks for AI).

Research thesis

Qiao's work primarily studies the lifecycle management of data in AI models, focusing on the theoretical methods and practical problems that arise as data are generated, used, and deleted. It aims to improve the reliability, interpretability, and controllability of AI models in heterogeneous, computation-constrained, and communication-constrained environments.

In data generation, it studies synthetic data and its effects on quality, privacy, and generalization.
In data use, it focuses on data modeling, collaborative optimization, and system design in distributed/federated learning, AI for Networks, and Networks for AI.
In data deletion, it studies machine unlearning and data influence evaluation, exploring how to preserve model performance while protecting privacy and satisfying deletion requests.

AI and networks

AI and Networks covers the intersection of AI with networking and communication systems: AI for Networks, Networks for AI, decentralized learning, data pruning, and collaborative evaluation. In the current CUHK doctoral stage, this line is paired with data-centric ML. It includes distributed tools such as Wasserstein barycenters, which combine multiple local distributions into a shared distributional reference without assuming that raw data are pooled.
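To make the barycenter idea concrete, here is a minimal sketch for the one-dimensional case only, where the Wasserstein-2 barycenter has a closed form: its quantile function is the weighted average of the local quantile functions. Each node shares a short quantile summary instead of its raw samples. The function names `local_quantiles` and `barycenter_quantiles` are hypothetical and chosen for illustration; this is not the algorithm described on the Distributed Wasserstein Barycenter page.

```python
from typing import List, Sequence


def local_quantiles(samples: Sequence[float], k: int) -> List[float]:
    """Summarize one node's 1-D samples by k evenly spaced empirical quantiles."""
    ordered = sorted(samples)
    n = len(ordered)
    summary = []
    for i in range(k):
        level = (i + 0.5) / k            # midpoint quantile levels in (0, 1)
        idx = min(n - 1, int(level * n))  # index of the matching order statistic
        summary.append(ordered[idx])
    return summary


def barycenter_quantiles(
    summaries: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Average the quantile summaries level by level (1-D W2 barycenter)."""
    total = sum(weights)
    k = len(summaries[0])
    return [
        sum(w * s[i] for w, s in zip(weights, summaries)) / total
        for i in range(k)
    ]


# Two nodes with disjoint local data; only the summaries are exchanged.
node_a = local_quantiles([0.0, 1.0, 2.0, 3.0], k=4)
node_b = local_quantiles([10.0, 11.0, 12.0, 13.0], k=4)
reference = barycenter_quantiles([node_a, node_b], weights=[1.0, 1.0])
print(reference)
```

The shared reference sits between the two local distributions and keeps their shape. In higher dimensions no such closed form is available, and iterative optimal-transport solvers are needed instead.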

Machine unlearning

Machine Unlearning studies certified data removal and low-cost update mechanisms after deletion requests. Related pages include Hessian-Free Online Certified Unlearning; Beyond Binary Erasure: Soft-Weighted Unlearning for Fairness and Robustness; DynFrs: An Efficient Framework for Machine Unlearning in Random Forest; Influence Functions; and Certified Data Removal.
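As a toy illustration of a low-cost update after a deletion request, the sketch below uses a model whose parameters depend on the data only through additive sufficient statistics. Forgetting a point then amounts to subtracting its contribution, and the result matches retraining on the remaining data. The class `RunningSlope` is hypothetical and covers only this exact, closed-form case; it is not the certified or Hessian-free method of the pages listed above, which target models where such exact subtraction is unavailable.

```python
class RunningSlope:
    """Least-squares slope through the origin, y ~ w * x, kept as running sums."""

    def __init__(self) -> None:
        self.sxy = 0.0  # running sum of x * y
        self.sxx = 0.0  # running sum of x * x

    def learn(self, x: float, y: float) -> None:
        """Add one training point to the sufficient statistics."""
        self.sxy += x * y
        self.sxx += x * x

    def forget(self, x: float, y: float) -> None:
        """Remove one previously learned point without touching the rest."""
        self.sxy -= x * y
        self.sxx -= x * x

    def slope(self) -> float:
        return self.sxy / self.sxx


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 100.0)]

model = RunningSlope()
for x, y in data:
    model.learn(x, y)
model.forget(3.0, 100.0)      # deletion request for the third point

retrained = RunningSlope()    # reference: retrain from scratch without it
for x, y in data[:2]:
    retrained.learn(x, y)

print(model.slope(), retrained.slope())
```

The deletion costs a constant number of operations, whereas retraining costs a pass over all remaining data.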

Synthetic data

Synthetic Data studies recursive synthetic-data training, Data Selection, Sample Selection Bias, Model Collapse, and collaborative mitigation in low-resource data silos. The central paper is When Sample Selection Bias Precipitates Model Collapse, which frames model collapse as especially risky when real-data coverage is scarce or fragmented.
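The interaction between recursive training and biased selection can be illustrated with a toy simulation. It assumes a one-dimensional Gaussian "model" that is refit each generation on samples drawn from the previous fit, after a selection step that keeps only samples close to the mean. The function `simulate_recursive_training` and its parameters are hypothetical, and the setup is a simplification for intuition, not the experimental design of the paper.

```python
import random
import statistics
from typing import List


def simulate_recursive_training(
    generations: int = 5,
    n: int = 2000,
    keep_within: float = 1.0,
    seed: int = 0,
) -> List[float]:
    """Return the data spread (std) at generation 0 and after each refit."""
    rng = random.Random(seed)
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]  # generation 0: "real" data
    spreads = [statistics.pstdev(data)]
    for _ in range(generations):
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        # Biased selection: keep only samples within keep_within std of the mean.
        kept = [x for x in data if abs(x - mu) <= keep_within * sigma]
        # Fit the next "model" on the selected samples only.
        mu_fit = statistics.fmean(kept)
        sigma_fit = statistics.pstdev(kept)
        # The next generation trains purely on synthetic samples from that fit.
        data = [rng.gauss(mu_fit, sigma_fit) for _ in range(n)]
        spreads.append(statistics.pstdev(data))
    return spreads


spreads = simulate_recursive_training()
for generation, spread in enumerate(spreads):
    print(generation, round(spread, 4))
```

Because the selection step discards the tails before every refit, the fitted spread shrinks from one generation to the next, which is one simple mechanism by which coverage of the original distribution is lost.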

Data centric ML and trustworthy AI

Data Centric ML covers data selection, valuation, filtering, and evaluation. Trustworthy AI connects unlearning, fairness, robustness, privacy, security, interpretability, and reliability.

Geometry and distributed learning

Wasserstein Geometry, Distributed Wasserstein Barycenter, and Distributed Learning provide tools for collaborative evaluation, optimal-transport proxies, decentralized data access, and distributional references for networked AI systems.