SDS Seminar Series - Jonathan Huggins, Boston University
Oct
24
2025
Description
The Fall 2025 SDS Seminar Series continues on October 24th from 2:00 p.m. to 3:00 p.m. with Dr. Jonathan Huggins (Assistant Professor, Department of Mathematics & Statistics, Boston University). This event is in-person in POB 6.304.
Title: Robust Model Selection for Discovery of Latent Mechanistic Processes
Abstract: When learning interpretable latent structures using model-based approaches, even small deviations from modeling assumptions can lead to inferential results that are not mechanistically meaningful. For example, many latent structures consist of K mechanistic processes (with K unknown). When the model is misspecified, likelihood-based model selection methods can substantially overestimate K as the sample size grows, while nonparametric methods can be overly conservative no matter how large the sample size. Hence, there is a need for model selection methods that combine the precision of likelihood-based approaches with the robustness of nonparametrics. To address this need in a principled manner, we first formalize the problem of robust model selection in latent variable models designed for mechanistic understanding as requiring an estimator for K to satisfy a robust model selection consistency property. The definition of robust model selection consistency motivates a particular family of model selection procedures, which rely on plug-in estimates of a component-wise discrepancy measure we call the accumulated cutoff discrepancy criterion (ACDC). We provide a method for constructing mechanistically meaningful component-wise discrepancies for a class of latent variable models that includes unsupervised and supervised variants of probabilistic matrix factorization (including factor analysis) and mixture models. We prove that ACDC provides robust model selection consistency for unsupervised matrix factorization and mixture models. Numerical results show that in practice our approach reliably identifies a physically meaningful number of latent processes in four illustrative applications, outperforming widely used model selection methods. An in-depth case study of cell type discovery using single-cell RNA sequencing data demonstrates that ACDC outperforms two widely used software packages designed specifically for single-cell data analysis.
Other Events in This Series
Oct
4
2024
SDS Seminar Series – Huiyan Sang, Texas A&M University
GS-BART: Graph Split Additive Decision Trees for Spatial and Network Data
2:00 pm – 3:00 pm • In Person
Speaker(s): Huiyan Sang
Oct
11
2024
SDS Seminar Series – Mingyuan Zhou, University of Texas at Austin
Building Faster, Better, and Safer Deep Generative Models via Score Identity Distillation
2:00 pm – 3:00 pm • In Person
Speaker(s): Mingyuan Zhou
Oct
18
2024
SDS Seminar Series – Sherry Zhang, University of Texas at Austin
Pivoting between Space and Time: Spatio-Temporal Analysis with Cubble
2:00 pm – 3:00 pm • In Person
Speaker(s): Sherry Zhang
Oct
25
2024
SDS Seminar Series – Matt Koslovsky, Colorado State University
Sparse Dirichlet-Multinomial Models
2:00 pm – 3:00 pm • In Person
Speaker(s): Matt Koslovsky
Nov
1
2024
SDS Seminar Series – Aaditya Ramdas, Carnegie Mellon University
A Game-Theoretic Theory of Statistical Evidence
2:00 pm – 3:00 pm • In Person
Speaker(s): Aaditya Ramdas
Nov
8
2024
SDS Seminar Series – Myungsoo Yoo, University of Texas at Austin
Dynamic Spatio-Temporal Model Integrating Physics for Fire Front Propagation
2:00 pm – 3:00 pm • In Person
Speaker(s): Myungsoo Yoo
Nov
15
2024
SDS Seminar Series – Rafael Irizarry, Harvard University
Twenty-Five Years of Data Science: Music, Genomics, and Public Health Surveillance
2:00 pm – 3:00 pm • In Person
Speaker(s): Rafael Irizarry
Mar
7
2025
SDS Seminar Series – Arun Kuchibhotla, Carnegie Mellon University
Adaptive Inference Techniques for Some Irregular Problems
2:00 pm – 3:00 pm • In Person
Speaker(s): Arun Kuchibhotla
Mar
28
2025
SDS Seminar Series – Po-Ling Loh, University of Cambridge
Differentially Private M-estimation via Noisy Optimization
2:00 pm – 3:00 pm • In Person
Speaker(s): Po-Ling Loh
Apr
18
2025
SDS Seminar Series – Richard Samworth, University of Cambridge
How Should We Do Linear Regression?
2:00 pm – 3:00 pm • In Person
Speaker(s): Richard Samworth