
Fall 2020 Colloquium: Graduate Portfolio in Applied Statistical Modeling




Title and Zoom link

Derek Pisner Dec 8 2:30 - 3:00pm

"Mining the Multiverse of the Ensemble Connectome"

Zoom link: https://utexas.zoom.us/j/98382184960

Meeting ID: 983 8218 4960

Isiana Rendon Dec 8 3:00 - 3:30pm

"Collaboration, Support Networks, and Retention"

Zoom link: https://utexas.zoom.us/j/98382184960

Meeting ID: 983 8218 4960

Ethan Tenison Dec 8 3:30 - 4:00pm

"The Impact of Resource Diversification and Dependence on Nonprofit Financial Stability"

Zoom link: https://utexas.zoom.us/j/98382184960

Meeting ID: 983 8218 4960

Zihan Yang Dec 9 10:30 - 11:00am

"The Audience Participation of YouTube Health-related Videos: What Motivates the Audience Leave Positive Comments"

Zoom link: https://utexas.zoom.us/j/95390569126

Meeting ID: 953 9056 9126

Sina Rafati Dec 9 11:00 - 11:30am

"Lagrangian analysis of turbulent lifted non-premixed flame toward an understanding of flame-turbulence interaction"

Zoom link: https://utexas.zoom.us/j/95390569126

Meeting ID: 953 9056 9126

Yuchen Xiao Dec 9 11:30am - Noon

"Spatial Causal Inference with Raster-Based Methods"

Zoom link: https://utexas.zoom.us/j/95390569126

Meeting ID: 953 9056 9126


Derek Pisner

A common misconception is that there is one best way to model a connectome. By definition, however, the shape of a brain network will vary depending on how sparsely or resolutely a researcher chooses to model it. As emerging studies have shown, many of the choices that determine that shape (metaparameters) lack clear defaults that can be hard-coded across analyses. If there were no "cap" on compute resources, such choices might even be re-learned for distinct brain networks, and perhaps further still to accommodate alternative data modalities or demographic differences. The purpose of the present study is to show that at least some connectome re-learning can be performed efficiently across studies, and may ultimately be critical to establishing a clinically useful connectomics. The analyses that follow attempt to optimize functional and structural connectomes for multiple brain networks of interest in clinical depression. The first is the so-called 'triple-network' (TN), which is relevant to self-referential attention biasing and can be defined in at least two distinct ways: as its intersection or as its union. The second is a phonological Language network (LN) whose structural inefficiency might perpetuate the TN's dysfunction and so prolong depressive symptoms. Using a public test-retest dataset of 300 dMRI and rsfMRI scans, together with a novel connectome ensemble learning pipeline (PyNets), we reconstruct 3.6 million variations of these networks, which we then investigate in a Multiverse Analysis. In essence, each variation represents a single leaf of a connectome-generating decision tree, differing from its neighbor by one and only one methodological choice. For each recipe of metaparameters, we benchmark reliability and discriminability, both globally and locally at each network node, and learn their decision sensitivities.
Although most variations of the TN and LN were at least moderately reliable (ICC>0.75) and discriminable (mean D=0.80-0.90), clear reproducibility "gaps" also emerged, and their recipes were incongruous across even the two TN definitions. Intersecting TN nodes along the cortical midline consistently showed the highest test-retest reliability, whereas lateral posterior nodes proved less stable across the multiverse of methods. Ultimately, discriminability was most sensitive to density-defining choices (e.g. smoothing kernel and node-fiber intersection distance), as well as to attribute-assigning metaparameters such as fiber length in the case of structural connectomes and oscillatory frequency band in the case of functional connectomes. Despite the absence of universally optimal choices that consistently preserve connectome reproducibility for these networks, we delineate a set of optimization constraints that future clinical studies might use to learn more methodologically stable, individual-level connectotypes of depression.
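The idea of a multiverse in which each leaf differs from its neighbor by a single choice can be sketched as a Cartesian product over metaparameter options. This is only an illustration and not the PyNets interface: the metaparameter names are taken loosely from the abstract, and every value listed is a hypothetical placeholder.

```python
from itertools import product

# Hypothetical metaparameter grid; names echo the abstract, values are made up.
grid = {
    "smoothing_fwhm_mm": [0, 2, 4],
    "frequency_band": ["low", "high"],
    "min_fiber_length_mm": [10, 20, 30],
}


def enumerate_recipes(grid):
    """Return one dict per leaf of the decision tree (one choice per metaparameter)."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]


def differs_in_one_choice(a, b):
    """True when two recipes disagree on exactly one metaparameter."""
    return sum(a[key] != b[key] for key in a) == 1


recipes = enumerate_recipes(grid)
print(len(recipes), "recipes in this toy multiverse")
```

The number of leaves is the product of the option counts, which is why even modest grids grow into millions of variations once more metaparameters are added.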


Isiana Rendon 

Teacher turnover is a persistent and devastating problem whose effects reach schools, teachers, students, and parents. Teachers’ working conditions, and especially their at-work relationships, influence their decisions to stay in or leave the classroom. Using the Teaching and Learning International Survey (TALIS) 2013 and both linear and logistic regression, this study seeks to uncover ties between the relative strength of teachers’ social and support networks and their decisions to quit. Findings include a “sweet spot” for teacher collaboration frequency and a shared sense of responsibility for their school’s success as factors in teachers’ overall job satisfaction. Implications and future research are discussed.


Ethan Tenison

The nonprofit sector plays a crucial role in lives across the country, providing many essential social services to the general population. Nevertheless, many nonprofits struggle to continue their operations year after year, and many diversify their funding sources to insulate themselves from financial shocks. Nonprofit finance scholars have encouraged this strategy for decades because research has shown that stable and diverse funding sources promote financial stability and autonomous decision making. Previous studies have used logistic and proportional hazard regression to estimate the impact of resource diversification on financial stability, using complex financial ratios and nonprofit categorical data obtained from IRS 990 forms. They have been far from complete, however, because the IRS does not publish the bulk data in a human-readable format. As a result, many of these studies have used only a small subset of nonprofit 990 forms. For this project, Amazon Web Services and Python were used to collect and interpret all of the e-filed 990 forms for the years 2015-2018. A natural language processing model, BERT, was also used to classify nonprofits based on their mission statements, providing a level of detail not previously available. Two different types of logistic regression models were used to predict whether nonprofits would be financially vulnerable in the future. The dependent variable was whether a nonprofit saw three consecutive declines in the ratio of program expenses to total expenses. This measure was chosen because nonprofits focus on providing services rather than on profit, as regular businesses do. The first model uses resource diversification as an independent variable, and the second model breaks resource diversification into its constituent components to see whether reliance on certain funding sources had more or less of an impact on financial vulnerability.
The results show that resource diversification does in fact decrease the probability of being financially vulnerable for some types of nonprofits. In addition, the second model showed that dependence on grants, on average, increased financial vulnerability, whereas reliance on membership dues reduced it. Overall, there were major differences between the nonprofit categories, but the conclusions should be interpreted with caution. More research is needed to redefine the dependent variable, as financial ratios can be inaccurate and difficult to interpret.
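The dependent variable described above can be sketched in a few lines. This is a minimal illustration under assumptions, not the author's code: the function names and the dollar figures are hypothetical, and a "decline" is assumed to mean a strict year-over-year drop in the ratio.

```python
def program_expense_ratios(program_expenses, total_expenses):
    """Ratio of program expenses to total expenses, one value per filing year."""
    return [p / t for p, t in zip(program_expenses, total_expenses)]


def has_three_consecutive_declines(ratios):
    """True if the ratio falls year over year three times in a row."""
    run = 0
    for previous, current in zip(ratios, ratios[1:]):
        run = run + 1 if current < previous else 0
        if run >= 3:
            return True
    return False


# Hypothetical filings for 2015-2018 (four years give three year-over-year changes).
program = [80_000, 75_000, 70_000, 65_000]
total = [100_000, 100_000, 100_000, 100_000]
vulnerable = has_three_consecutive_declines(program_expense_ratios(program, total))
print("financially vulnerable:", vulnerable)
```

With four filing years, the flag is set only when every year-over-year change is a decline, which makes the label fairly strict.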


Zihan Yang

As more people pursue healthy personal habits, health-related videos covering diet, fitness, and exercise have become a large category that attracts a broad audience. According to previous research, a video’s audience participation refers to how audiences engage with the video, including liking, disliking, commenting, sharing, and uploading videos. Among all the audience participation actions, commenting is the most complex, because the audience can not only leave comments under vlogs but also read them. This study focuses on how audience participation actions on YouTube health-related videos influence the emotional style of the comments. An API tool and the text analysis application LIWC2015 are used to collect and analyze comment data from health-related vlogs. Two linguistic styles of comments (analytic and clout), whether the video is sponsored or not, and the audience participation actions are included as multilevel variables in the dataset. Multilevel modeling analysis reveals that videos with more externally focused comments lead the audience to leave more positive comments. Comments with an external and narrative linguistic style are more likely to be positive, and whether the video is sponsored or not does not change this relationship. However, the number of opinion leaders in the comments moderates the relationship between the comments’ emotions and their level of analytical thought.


Sina Rafati

A non-premixed turbulent lifted jet flame is studied from both Lagrangian and Eulerian points of view to better understand flame-turbulence interaction. High-speed planar particle image velocimetry (PIV) was used to measure streamwise and transverse velocity components. The field of view was selected so that it contains the flame base. The flame front was resolved using the abrupt change in the number density of the PIV particles, because the particle density correlates with the gas density, which is inversely proportional to the gas temperature. Lagrangian Coherent Structures (LCSs), defined as ridges of Finite-Time Lyapunov Exponent (FTLE) fields, were used to resolve invariant material manifolds. The vorticity field is resolved to investigate its association with LCSs. The Lagrangian framework is shown to be more effective than Eulerian analysis in capturing the geometry of flame and turbulence transport and their interaction. The manifolds' time history shows how eddy structures interact with the flame and highlights their role in stabilizing the lifted jet flames with local Reynolds numbers (Reδ) of 3500 and 5000. It is shown that the flow features due to stable manifolds can create a premixed zone that opposes flame propagation. The presence of unstable manifolds, in contrast, breaks the balance between the gas velocity and the flame propagation speed, causing the flame to become unstable and move downstream. In addition, the flame is shown to propagate upstream through holes in the stable manifolds. It is also observed that the maximum principal compressive strain rate corresponds to the repelling LCSs.


Yuchen Xiao

We introduce potential outcomes as a basic framework for causal inference methodology and illustrate how causal inference provides new estimands of interest that expand traditional associational interpretations in earth science problems. Two basic causal inference methods for completely randomized experiments, Fisher’s Exact Test and Neyman’s Repeated Sampling, are deployed to investigate how saltwater disposals (SWDs) in the Fort Worth Basin of North-Central Texas causally affect induced seismicity in the immediately surrounding area. The study regions are discretized into independent and non-overlapping grid blocks. Fisher's Exact Test with a sharp null hypothesis establishes a formal causal link between earthquakes and the presence of SWDs in grid blocks for the entire study region. Neyman's Repeated Sampling estimates the average increase in earthquake occurrences in grid blocks that have SWDs relative to those without SWDs. Two important contributions are added to the existing literature: (1) we fully examine how variation in grid size and grid offset affects the results obtained from statistical analysis; (2) our novel hierarchical design enables generating thousands of synthetic datasets while preserving spatial correlation. This essentially replaces the resampling technique, which often leads to biased null distributions and, in turn, to misinterpretations of test statistics. Our results show a strong causal link between earthquake occurrences and the presence of SWDs in grid blocks for the entire study region over all ranges of grid size and grid offset. The results obtained from Neyman's Repeated Sampling further indicate that the presence of SWDs in grid blocks, on average, causes 30 more earthquakes for a range of grid block sizes.
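The two methods named above can be sketched for grid-block data as follows. This is a minimal illustration under stated assumptions rather than the study's procedure: the test statistic (difference in mean earthquake counts), the Monte Carlo approximation of the randomization distribution, the function names, and the toy counts are all hypothetical choices made here.

```python
import random
from statistics import mean, variance


def split_by_treatment(outcomes, treated):
    """Separate outcomes into blocks with an SWD (treated) and without (control)."""
    with_swd = [y for y, w in zip(outcomes, treated) if w]
    without_swd = [y for y, w in zip(outcomes, treated) if not w]
    return with_swd, without_swd


def diff_in_means(outcomes, treated):
    with_swd, without_swd = split_by_treatment(outcomes, treated)
    return mean(with_swd) - mean(without_swd)


def fisher_randomization_p_value(outcomes, treated, n_draws=2000, seed=0):
    """Approximate p-value under the sharp null of no effect in any block.

    Under the sharp null the outcomes are fixed, so only the labels are
    re-randomized; the statistic is recomputed for each shuffled assignment.
    """
    rng = random.Random(seed)
    observed = abs(diff_in_means(outcomes, treated))
    labels = list(treated)
    extreme = 0
    for _ in range(n_draws):
        rng.shuffle(labels)
        if abs(diff_in_means(outcomes, labels)) >= observed:
            extreme += 1
    return (extreme + 1) / (n_draws + 1)


def neyman_estimate(outcomes, treated):
    """Difference-in-means estimate of the average effect with Neyman's
    conservative standard error (sum of per-group sample variances over n)."""
    with_swd, without_swd = split_by_treatment(outcomes, treated)
    estimate = mean(with_swd) - mean(without_swd)
    std_error = (variance(with_swd) / len(with_swd)
                 + variance(without_swd) / len(without_swd)) ** 0.5
    return estimate, std_error


# Hypothetical earthquake counts per grid block; 1 marks a block containing an SWD.
counts = [5, 7, 6, 8, 0, 1, 0, 2]
has_swd = [1, 1, 1, 1, 0, 0, 0, 0]
p_value = fisher_randomization_p_value(counts, has_swd)
estimate, std_error = neyman_estimate(counts, has_swd)
print(f"p-value: {p_value:.3f}, estimated effect: {estimate:.2f} +/- {std_error:.2f}")
```

Note that plain label shuffling treats grid blocks as exchangeable, which ignores spatial correlation; the abstract's hierarchical design for generating synthetic data is aimed at exactly that limitation and is not reproduced here.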