SDS Seminar Series – Dr. Purna Sarkar
Mar
29
2024
Description
The Spring 2024 SDS Seminar Series continues on March 29th from 2:00 p.m. to 3:00 p.m. with Dr. Purna Sarkar (Statistics and Data Sciences, University of Texas at Austin). This event is in-person.
Title: Some New Results for Streaming Principal Component Analysis
Abstract: Streaming PCA, also known as Oja's algorithm, with roots going back to 1949, has attracted much attention in Statistics and Computer Science in the last decade. In this talk, I will discuss two of our works that consider this problem under slight departures from the setup considered widely in the literature.
Our first work looks at data streams generated from a Markov chain. While streaming PCA is typically analyzed under the IID data model, in many applications like distributed optimization, data points are sampled from a Markov chain and, therefore, are dependent. The naive approach of dropping data leads to a suboptimal rate. We use a novel linearization argument to remove the logarithmic dependence on the number of samples n.
Typically, the analysis of Oja's algorithm assumes that the effective rank of the covariance matrix is much smaller than n. Our second work examines online sparse PCA, where the effective rank is comparable to n. This differs from previously studied settings because the Oja vector does not concentrate on the true population eigenvector. Here, we show that a simple thresholding yields a consistent estimate of the population eigenvector. Both are joint work with Syamantak Kumar.
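For readers unfamiliar with the method, the following is a minimal sketch of Oja's update followed by a thresholding step. It is a toy illustration only and is not taken from the talk: the function names, the constant step size, the top-k thresholding rule, and the small six-dimensional IID data stream are all assumptions made here, and the example does not reproduce the Markovian or high effective rank settings described in the abstract.

```python
import math
import random


def oja_streaming_pca(stream, dim, step_size, rng):
    """One pass of Oja's update: w <- normalize(w + step_size * x * (x . w))."""
    # Random unit-norm starting vector (an assumption; other inits are possible).
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in w))
    w = [c / norm for c in w]
    for x in stream:
        proj = sum(xi * wi for xi, wi in zip(x, w))
        w = [wi + step_size * proj * xi for xi, wi in zip(x, w)]
        norm = math.sqrt(sum(c * c for c in w))
        w = [c / norm for c in w]
    return w


def hard_threshold(w, k):
    """Keep the k largest-magnitude entries, zero the rest, renormalize.

    Top-k selection is a hypothetical rule chosen for illustration; the
    thresholding used in the talk may differ.
    """
    order = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)
    keep = set(order[:k])
    sparse = [c if i in keep else 0.0 for i, c in enumerate(w)]
    norm = math.sqrt(sum(c * c for c in sparse))
    return [c / norm for c in sparse]


def make_stream(n, v, signal, noise, rng):
    """Yield n IID samples x = signal * z * v + noise * g, with z and g Gaussian."""
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        yield [signal * z * vi + noise * rng.gauss(0.0, 1.0) for vi in v]


rng = random.Random(0)
# Sparse population eigenvector supported on the first two coordinates.
v = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0, 0.0, 0.0, 0.0]
w = oja_streaming_pca(make_stream(2000, v, 2.0, 0.1, rng), len(v), 0.01, rng)
w_sparse = hard_threshold(w, k=2)
alignment = abs(sum(a * b for a, b in zip(w_sparse, v)))
print(f"|<w_sparse, v>| = {alignment:.3f}")
```

The estimate is only defined up to sign, which is why the alignment is reported as an absolute inner product with the assumed eigenvector v.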
Location
Peter O’Donnell Jr. Building (POB) 2.302
Other Events in This Series
Sep
8
2023
SDS Seminar Series – Dr. Emily Roberts
A Causal Inference Approach for Surrogate Marker Evaluation with Mixed Models
2:00 pm – 3:00 pm • In Person
Speaker(s): Emily Roberts
Sep
15
2023
SDS Seminar Series – Dr. Dimitris Korobilis
Monitoring Multicountry Macroeconomic Risk
2:00 pm – 3:00 pm • Virtual
Speaker(s): Dimitris Korobilis
Sep
22
2023
SDS Seminar Series – Dr. Will Fithian
Estimating the False Discovery Rate of Model Selection
2:00 pm – 3:00 pm • In Person
Speaker(s): Will Fithian
Sep
29
2023
SDS Seminar Series – Dr. David Moriarty
A Data Science Journey in Business
2:00 pm – 3:00 pm • In Person
Speaker(s): David Moriarty
Oct
6
2023
SDS Seminar Series – Dr. Amanda Ellis
Navigating the Future of Statistics Education: Leveraging ChatGPT's Advantages and Overcoming Challenges
2:00 pm – 3:00 pm • Virtual
Speaker(s): Amanda Ellis
Oct
20
2023
SDS Seminar Series – Dr. Amy Zhang
Bisimulation and Reinforcement Learning
2:00 pm – 3:00 pm • Virtual
Speaker(s): Amy Zhang
Oct
27
2023
SDS Seminar Series – Dr. Marcelo Medeiros
Global Inflation Forecasting: Benefits from Machine Learning Methods
2:00 pm – 3:00 pm • Virtual
Speaker(s): Marcelo Medeiros
Nov
3
2023
SDS Seminar Series – Dr. Steve Yadlowsky
Choosing a Proxy Metric from Past Experiments
2:00 pm – 3:00 pm • Virtual
Speaker(s): Steve Yadlowsky
Nov
10
2023
SDS Seminar Series – Drew Herren
Statistical Aspects of SHAP: Functional ANOVA for Model Interpretation
2:00 pm – 3:00 pm • In Person
Speaker(s): Drew Herren
Dec
1
2023
SDS Seminar Series – Dr. Dave Zhao
High-Dimensional Nonparametric Empirical Bayes Problems in Genomics
2:00 pm – 3:00 pm • In Person
Speaker(s): Dave Zhao