SDS Seminar Series – Dr. Daniela Witten
Apr 12, 2024
Description
The Spring 2024 SDS Seminar Series continues on April 12th from 2:00 p.m. to 3:00 p.m. with Dr. Daniela Witten (Biostatistics and Statistics, University of Washington). This event is in person.
Title: Data Thinning and Its Applications
Abstract: We propose data thinning, a new approach for splitting an observation from a known distributional family with unknown parameter(s) into two or more independent parts that sum to yield the original observation, and that follow the same distribution as the original observation, up to a (known) scaling of a parameter. This proposal is very general, and can be applied to a broad class of distributions within the natural exponential family, including the Gaussian, Poisson, negative binomial, Gamma, and binomial distributions, among others. Furthermore, we generalize data thinning to enable splitting an observation into two or more parts that can be combined to yield the original observation using an operation other than addition; this enables the application of data thinning far beyond the natural exponential family. Data thinning has a number of applications to model selection, evaluation, and inference. For instance, cross-validation via data thinning provides an attractive alternative to the "usual" approach of cross-validation via sample splitting, especially in unsupervised settings in which the latter is not applicable. We will present an application of data thinning to single-cell RNA-sequencing data, in a setting where sample splitting is not applicable. This is joint work with Anna Neufeld (Fred Hutch), Ameer Dharamshi (University of Washington), Lucy Gao (University of British Columbia), and Jacob Bien (University of Southern California).
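As a rough illustration of the additive splitting described in the abstract, the sketch below covers only the Poisson case. It relies on the standard result that if X ~ Poisson(λ) and X1 | X ~ Binomial(X, ε), then X1 ~ Poisson(ελ) and X2 = X - X1 ~ Poisson((1 - ε)λ) are independent. The function name `thin_poisson`, the parameter `eps`, and the example counts are illustrative assumptions, not taken from the talk or from the authors' software.

```python
import random


def thin_poisson(x, eps=0.5, rng=None):
    """Split a Poisson count x into two parts that sum to x.

    Hypothetical helper for illustration: each of the x units is assigned
    to the first part independently with probability eps, which is a
    Binomial(x, eps) draw.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    if rng is None:
        rng = random.Random()
    x1 = sum(rng.random() < eps for _ in range(x))
    return x1, x - x1


# Made-up counts standing in for observed data (e.g. one gene across cells).
counts = [7, 0, 3, 12]
rng = random.Random(0)  # seeded only so the example is reproducible
parts = [thin_poisson(x, eps=0.5, rng=rng) for x in counts]

for x, (x1, x2) in zip(counts, parts):
    print(f"x={x:2d} -> x1={x1:2d}, x2={x2:2d}")
```

In a cross-validation setting, one part could be used to fit a model and the other to evaluate it. With `eps=0.5` the two parts play symmetric roles.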
Location
Peter O’Donnell Jr. Building (POB) 2.302
Other Events in This Series
Mar 1, 2024
SDS Seminar Series – Dr. Laura Hatfield
Predict, Correct, Select: A New General Identification Strategy for Controlled Pre-Post Designs
2:00 pm – 3:00 pm • Virtual
Speaker(s): Laura Hatfield
Mar 22, 2024
SDS Seminar Series – Dr. Sivaraman Balakrishnan
Statistical Inference for Optimal Transport
2:00 pm – 3:00 pm • In Person
Speaker(s): Sivaraman Balakrishnan
Mar 29, 2024
SDS Seminar Series – Dr. Purna Sarkar
Some New Results for Streaming Principal Component Analysis
2:00 pm – 3:00 pm • In Person
Speaker(s): Purna Sarkar
Apr 19, 2024
SDS Seminar Series – Dr. William Rosenberger
Design and Inference for Enrichment Trials with a Continuous Biomarker
2:00 pm – 3:00 pm • In Person
Speaker(s): William Rosenberger
Apr 26, 2024
SDS Seminar Series – Dr. Bodhisattva Sen
Extending the Scope of Nonparametric Empirical Bayes
2:00 pm – 3:00 pm • In Person
Speaker(s): Bodhisattva Sen
Sep 6, 2024
SDS Seminar Series – Christine Peterson, University of Texas MD Anderson Cancer Center
New Methods for Microbiome Data Integration
2:00 pm – 3:00 pm • In Person
Speaker(s): Christine Peterson
Sep 13, 2024
SDS Seminar Series – Matthew Vanaman, University of Texas at Austin
Data Analysis from the Zoo to the Wild and Back
2:00 pm – 3:00 pm • In Person
Speaker(s): Matthew Vanaman
Sep 20, 2024
SDS Seminar Series – Saptarshi Roy, University of Texas at Austin
On the Computational Complexity of Private High-dimensional Model Selection
2:00 pm – 3:00 pm • In Person
Speaker(s): Saptarshi Roy
Sep 27, 2024
SDS Seminar Series – Abhra Sarkar, University of Texas at Austin
(Bayesian) Semiparametric Local Inference (and Other Stories)
2:00 pm – 3:00 pm • In Person
Speaker(s): Abhra Sarkar
Oct 4, 2024
SDS Seminar Series – Huiyan Sang, Texas A&M University
GS-BART: Graph Split Additive Decision Trees for Spatial and Network Data
2:00 pm – 3:00 pm • In Person
Speaker(s): Huiyan Sang