Publications
(*) denotes equal contribution
2026
- [arXiv] Likelihood Matching for Diffusion Models. Lei Qian, Wu Su, Yanqi Huang, and Song Xi Chen. 2026.
We propose a Likelihood Matching approach for training diffusion models by first establishing an equivalence between the likelihood of the target data distribution and a likelihood along the sample path of the reverse diffusion. To efficiently compute the reverse sample likelihood, a quasi-likelihood is considered to approximate each reverse transition density by a Gaussian distribution with matched conditional mean and covariance, respectively. The score and Hessian functions for the diffusion generation are estimated by maximizing the quasi-likelihood, ensuring a consistent matching of both the first two transitional moments between every two time points. A stochastic sampler is introduced to facilitate computation that leverages both the estimated score and Hessian information. We establish consistency of the quasi-maximum likelihood estimation, and provide non-asymptotic convergence guarantees for the proposed sampler, quantifying the rates of the approximation errors due to the score and Hessian estimation, dimensionality, and the number of diffusion steps. Empirical and simulation evaluations demonstrate the effectiveness of the proposed Likelihood Matching and validate the theoretical results.
@article{qian2026LM,
  title = {Likelihood Matching for Diffusion Models},
  author = {Qian, Lei and Su, Wu and Huang, Yanqi and Chen, Song Xi},
  year = {2026},
  eprint = {2508.03636},
  archiveprefix = {arXiv},
  primaryclass = {stat.ML},
}
2025
- [arXiv] Partially Functional Dynamic Backdoor Diffusion-based Causal Model. Xinwen Liu, Lei Qian, Song Xi Chen, and Niansheng Tang. 2025.
Causal inference in settings involving complex spatio-temporal dependencies, such as environmental epidemiology, is challenging due to the presence of unmeasured confounding. However, a significant gap persists in existing methods: current diffusion-based causal models rely on restrictive assumptions of causal sufficiency or static confounding. To address this limitation, we introduce the Partially Functional Dynamic Backdoor Diffusion-based Causal Model (PFD-BDCM), a generative framework designed to bridge this gap. Our approach uniquely incorporates valid backdoor adjustments into the diffusion sampling mechanism to mitigate bias from unmeasured confounders. Specifically, it captures their intricate dynamics through region-specific structural equations and conditional autoregressive processes, and accommodates multi-resolution variables via functional data techniques. Furthermore, we provide theoretical guarantees by establishing error bounds for counterfactual estimates. Extensive experiments on synthetic data and a real-world air pollution case study confirm that PFD-BDCM outperforms current state-of-the-art methods.
@article{liu2026PFDBDCM,
  title = {Partially Functional Dynamic Backdoor Diffusion-based Causal Model},
  author = {Liu, Xinwen and Qian, Lei and Chen, Song Xi and Tang, Niansheng},
  year = {2025},
  eprint = {2509.00472},
  archiveprefix = {arXiv},
  primaryclass = {stat.ML},
}
2023
- [QTQM] An EWMA chart for high dimensional process with multi-class out-of-control information via random forest learning. Mingze Sun, Lei Qian, Amitava Mukherjee, and Dongdong Xiang. Quality Technology & Quantitative Management, 2023.
Modern manufacturing and quality monitoring involve multi-class out-of-control (OOC) information from the training sample. It is essential to use such information during online monitoring of data streams from complex processes. In this paper, a monitoring framework is designed by combining the random forest technique with the exponentially weighted moving average method for monitoring complex processes with multi-class OOC information. To be specific, a process surveillance technique in the form of a control chart is proposed based on the probability that the online data is classified as an in-control (IC) sample, and the control chart triggers an alarm when the probability is lower than the control limit. Our numerical findings based on the Monte–Carlo simulation show that the proposed control chart performs more effectively than its competitors under various distributions and data types, especially for high-dimensional cases when multi-class OOC information is known in advance. Moreover, the proposed method is illustrated with an application using the data related to the hard disk manufacturing processes.
@article{sun2023EWMA,
  author = {Sun, Mingze and Qian, Lei and Mukherjee, Amitava and Xiang, Dongdong},
  title = {An EWMA chart for high dimensional process with multi-class out-of-control information via random forest learning},
  journal = {Quality Technology \& Quantitative Management},
  volume = {0},
  number = {0},
  pages = {1--27},
  year = {2023},
  publisher = {Taylor \& Francis},
  doi = {10.1080/16843703.2023.2244213},
  url = {https://doi.org/10.1080/16843703.2023.2244213},
  eprint = {https://doi.org/10.1080/16843703.2023.2244213},
}