SDS Seminar Series – Lydia Lucchesi, University of Texas at Austin
Sep
12
2025

Description
The Fall 2025 SDS Seminar Series continues on September 12th from 2:00 p.m. to 3:00 p.m. with Dr. Lydia Lucchesi (Postdoctoral Fellow, Department of Statistics and Data Sciences, University of Texas at Austin). This event will be held in person in the Avaya Room (POB 2.302).
Title: Visual Documentation for Data Preprocessing in R and Python
Abstract: Data preprocessing is a crucial intermediate stage in many data analyses but is often overlooked in the documentation and dissemination of research. This talk introduces the smallsets R package for building Smallset Timelines, static and compact visualizations for communicating data preprocessing decisions. A Smallset Timeline is composed of small dataset snapshots that document the sequence of decisions in a preprocessing pipeline. The smallsets R package builds this figure from a user’s R or Python preprocessing script, which contains structured comments with snapshot instructions. This talk also presents findings from a focus group study that gathered feedback from prospective smallsets users on the package’s utility and usability. The feedback will inform future software development efforts for smallsets.
Other Events in This Series
Sep
5
2025
SDS Seminar Series – Sarah Coleman, University of Texas at Austin
A Linear Mixed Effects Model for Evaluating Synthetic Gene Circuits
2:00 pm – 3:00 pm • In Person
Speaker(s): Sarah Coleman
Sep
19
2025
SDS Seminar Series – Tuan Pham, University of Texas at Austin
Time-uniform Bounds for Iterated Algorithms
2:00 pm – 3:00 pm • In Person
Speaker(s): Tuan Pham
Sep
26
2025
SDS Seminar Series – Ryan Giordano, University of California, Berkeley
Local Weighting-Based Diagnostics for Bayesian Multilevel Regression with Poststratification
2:00 pm – 3:00 pm • In Person
Speaker(s): Ryan Giordano
Oct
3
2025
SDS Seminar Series – Rafael Campello de Alcantara, University of Texas at Austin
Searching for Parallel Trends: A Decision Tree Algorithm for Discovering Conditional Diff-in-Diff Estimators
2:00 pm – 3:00 pm • In Person
Speaker(s): Rafael Campello de Alcantara
Oct
10
2025
SDS Seminar Series – Michele Guindani, University of California, Los Angeles
TBA
2:00 pm – 3:00 pm • In Person
Speaker(s): Michele Guindani
Oct
17
2025
SDS Seminar Series – Wenyi Wang, MD Anderson Cancer Center
Deciphering Tumor Heterogeneity for Benefits from Immunotherapy in Cancer
2:00 pm – 3:00 pm • In Person
Speaker(s): Wenyi Wang
Oct
24
2025
SDS Seminar Series – Jonathan Huggins, Boston University
Robust Model Selection for Discovery of Latent Mechanistic Processes
2:00 pm – 3:00 pm • In Person
Speaker(s): Jonathan Huggins
Oct
31
2025
SDS Seminar Series – Max Goplerud, University of Texas at Austin
TBA
2:00 pm – 3:00 pm • In Person
Speaker(s): Max Goplerud
Nov
7
2025
SDS Seminar Series – Jeffrey Miller, Harvard University
TBA
2:00 pm – 3:00 pm • In Person
Speaker(s): Jeffrey Miller