SDS Seminar Series - Jonathan Huggins, Boston University
Oct
24
2025
Description
The Fall 2025 SDS Seminar Series continues on October 24th from 2:00 p.m. to 3:00 p.m. with Dr. Jonathan Huggins (Assistant Professor, Department of Mathematics & Statistics, Boston University). This event is in-person in POB 6.304.
Title: Robust Model Selection for Discovery of Latent Mechanistic Processes
Abstract: When learning interpretable latent structures using model-based approaches, even small deviations from modeling assumptions can lead to inferential results that are not mechanistically meaningful. For example, many latent structures consist of K mechanistic processes (with K unknown). When the model is misspecified, likelihood-based model selection methods can substantially overestimate K as the sample size grows, while nonparametric methods can be overly conservative no matter how large the sample size. Hence, there is a need for model selection methods that combine the precision of likelihood-based approaches with the robustness of nonparametrics. To address this need in a principled manner, we first formalize the problem of robust model selection in latent variable models designed for mechanistic understanding as requiring an estimator for K to satisfy a robust model selection consistency property. The definition of robust model selection consistency motivates a particular family of model selection procedures, which rely on plug-in estimates of a component-wise discrepancy measure we call the accumulated cutoff discrepancy criterion (ACDC). We provide a method for constructing mechanistically meaningful component-wise discrepancies for a class of latent variable models that includes unsupervised and supervised variants of probabilistic matrix factorization (including factor analysis) and mixture models. We prove that ACDC provides robust model selection consistency for unsupervised matrix factorization and mixture models. Numerical results show that in practice our approach reliably identifies a physically meaningful number of latent processes in four illustrative applications, outperforming widely used model selection methods. An in-depth case study of cell type discovery using single-cell RNA sequencing data demonstrates that ACDC outperforms two widely used software packages designed specifically for single-cell data analysis.
Other Events in This Series
Mar
1
2024
SDS Seminar Series – Dr. Laura Hatfield
Predict, Correct, Select: A New General Identification Strategy for Controlled Pre-Post Designs
2:00 pm – 3:00 pm • Virtual
Speaker(s): Laura Hatfield
Mar
22
2024
SDS Seminar Series – Dr. Sivaraman Balakrishnan
Statistical Inference for Optimal Transport
2:00 pm – 3:00 pm • In Person
Speaker(s): Sivaraman Balakrishnan
Mar
29
2024
SDS Seminar Series – Dr. Purna Sarkar
Some New Results for Streaming Principal Component Analysis
2:00 pm – 3:00 pm • In Person
Speaker(s): Purna Sarkar
Apr
12
2024
SDS Seminar Series – Dr. Daniela Witten
Data Thinning and Its Applications
2:00 pm – 3:00 pm • In Person
Speaker(s): Daniela Witten
Apr
19
2024
SDS Seminar Series – Dr. William Rosenberger
Design and Inference for Enrichment Trials with a Continuous Biomarker
2:00 pm – 3:00 pm • In Person
Speaker(s): William Rosenberger
Apr
26
2024
SDS Seminar Series – Dr. Bodhisattva Sen
Extending the Scope of Nonparametric Empirical Bayes
2:00 pm – 3:00 pm • In Person
Speaker(s): Bodhisattva Sen
Sep
6
2024
SDS Seminar Series – Christine Peterson, University of Texas MD Anderson Cancer Center
New Methods for Microbiome Data Integration
2:00 pm – 3:00 pm • In Person
Speaker(s): Christine Peterson
Sep
13
2024
SDS Seminar Series – Matthew Vanaman, University of Texas at Austin
Data Analysis from the Zoo to the Wild and Back
2:00 pm – 3:00 pm • In Person
Speaker(s): Matthew Vanaman
Sep
20
2024
SDS Seminar Series – Saptarshi Roy, University of Texas at Austin
On the Computational Complexity of Private High-dimensional Model Selection
2:00 pm – 3:00 pm • In Person
Speaker(s): Saptarshi Roy
Sep
27
2024
SDS Seminar Series – Abhra Sarkar, University of Texas at Austin
(Bayesian) Semiparametric Local Inference (and Other Stories)
2:00 pm – 3:00 pm • In Person
Speaker(s): Abhra Sarkar