Mentors and Project Descriptions
Autumn 2025 Projects
Andrew Zhang: Functional Inequalities
If X is an n-dimensional random vector, it is natural to ask how a real-valued function f(X) concentrates around its expectation E[f(X)]. If X is uniformly distributed on a bounded open set with a smooth boundary, then classical results such as the Poincaré inequality provide a bound on the variance of f(X) that depends on the ambient dimension n. Remarkably, there are random variables that exhibit dimension-free concentration, such as a standard Gaussian vector via the Gaussian Poincaré inequality. In this DRP, we will do a guided reading of such functional inequalities, their relationship to Markov diffusions, and their implications for the theory of sampling.
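For concreteness, the Gaussian Poincaré inequality mentioned above can be stated as follows (for a smooth function f with finite variance); the constant 1 on the right-hand side does not depend on the dimension n, which is the point of contrast with the bounded-domain case:

    \[
    \operatorname{Var}\bigl(f(X)\bigr) \;\le\; \mathbb{E}\bigl[\|\nabla f(X)\|^{2}\bigr],
    \qquad X \sim \mathcal{N}(0, I_n).
    \]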
Ethan Ancell and Kayla Irish: Multiple Testing
When an analyst considers multiple statistical inferences at once (for example, by conducting many hypothesis tests simultaneously), standard statistical guarantees such as bounds on Type I error break down. The field of multiple testing addresses this problem by adjusting p-values appropriately to yield bounds on metrics such as the family-wise error rate (FWER) or the false discovery rate (FDR). This DRP will emphasize reading modern research papers from the field of multiple testing, with opportunities to code simulations as the mentees desire.
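As a minimal illustration of the kind of adjustment involved (simulated p-values, not drawn from any particular paper), the sketch below compares the Bonferroni correction, which controls the FWER, with the Benjamini-Hochberg step-up procedure, which controls the FDR when the p-values are independent.

    import numpy as np

    def bonferroni(pvals, alpha=0.05):
        """Reject H_i when p_i <= alpha / m; controls the family-wise error rate."""
        m = len(pvals)
        return pvals <= alpha / m

    def benjamini_hochberg(pvals, alpha=0.05):
        """Benjamini-Hochberg step-up procedure; controls the false discovery
        rate at level alpha when the p-values are independent."""
        m = len(pvals)
        order = np.argsort(pvals)
        ranked = pvals[order]
        thresholds = alpha * np.arange(1, m + 1) / m
        below = np.nonzero(ranked <= thresholds)[0]
        reject = np.zeros(m, dtype=bool)
        if below.size > 0:
            k = below.max()                # largest rank meeting its threshold
            reject[order[:k + 1]] = True   # reject everything up to that rank
        return reject

    # Simulated example: 90 true nulls (uniform p-values) and 10 strong signals.
    rng = np.random.default_rng(0)
    pvals = np.concatenate([rng.uniform(size=90), rng.uniform(0, 1e-3, size=10)])
    print("Bonferroni rejections:", bonferroni(pvals).sum())
    print("BH rejections:        ", benjamini_hochberg(pvals).sum())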
Hansen Zhang: Martingales and the Monkey Problem
In this project, students will explore a famous probability puzzle: If a monkey types letters at random, how long will it take to produce the word “ABRACADABRA”? This deceptively simple question reveals rich mathematics. Students will model the problem using a Markov chain, where states track partial progress toward the target word, and compute the expected hitting time to the absorbing state. They will then connect the analysis to martingales, showing how fairness conditions and stopping times can provide alternative tools for reasoning about waiting times. The project emphasizes problem-solving, probabilistic modeling, and connections to real-world applications such as DNA sequence matching and information theory, which we may explore further given sufficient time.
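To give a taste of where the martingale argument leads (a hypothetical sketch; the function names below are made up), optional stopping yields a closed form: the expected number of keystrokes is the sum of alphabet_size**k over every length k at which a prefix of the target word is also a suffix, which for "ABRACADABRA" equals 26**11 + 26**4 + 26. That number is far too large to simulate directly, so the Monte Carlo check below uses a tiny alphabet instead.

    import random

    def expected_wait(word, alphabet_size=26):
        """Expected number of keystrokes until `word` first appears, via the
        optional stopping argument: sum alphabet_size**k over every k such
        that the length-k prefix of the word equals its length-k suffix."""
        n = len(word)
        return sum(alphabet_size ** k
                   for k in range(1, n + 1)
                   if word[:k] == word[-k:])

    def simulate_wait(word, alphabet, rng):
        """Monte Carlo: type uniformly random letters until `word` appears."""
        typed = ""
        while not typed.endswith(word):
            typed += rng.choice(alphabet)
        return len(typed)

    print(expected_wait("ABRACADABRA"))           # 26**11 + 26**4 + 26

    rng = random.Random(0)
    word, alphabet = "ABA", "AB"
    trials = [simulate_wait(word, alphabet, rng) for _ in range(20_000)]
    print(expected_wait(word, alphabet_size=2))   # 2**3 + 2**1 = 10
    print(sum(trials) / len(trials))              # should be close to 10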
Patrick Campbell: Introduction to Statistical Learning
This project will explore methods and models from statistical learning, an increasingly popular subject with applications in medicine, business, and more. The reading will cover topics such as linear/logistic regression, resampling methods, and model selection. We will mainly refer to the freely available textbooks "An Introduction to Statistical Learning" (with applications in R or Python). The mentee will explore and ultimately develop their own model(s) for an application/dataset of their interest. Prerequisites: Stat 311 or equivalent required; coding experience in R/Python preferred; knowledge of linear algebra (e.g., Math 208) helpful but not required.
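As one possible flavor of the modeling component (a sketch on synthetic data using scikit-learn; the project is not tied to this library, dataset, or model), the snippet fits a logistic regression and evaluates it with 5-fold cross-validation.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Synthetic binary-classification data: two informative features plus noise.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))
    logits = 1.5 * X[:, 0] - 2.0 * X[:, 1]
    y = rng.binomial(1, 1 / (1 + np.exp(-logits)))

    model = LogisticRegression()
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validation
    print("accuracy per fold:", np.round(scores, 3))
    print("mean accuracy:    ", round(scores.mean(), 3))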
Ronan Perry: Patterns, Predictions, and Actions
This will be a guided reading of Patterns, Predictions, and Actions. We will study the problem of making predictions: evaluation of predictions, optimization of actions using observed data, generalization to unobserved data, and more. We will work through small problems and implement numeric solutions.
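As an example of the kind of small numeric exercise this might involve (a hypothetical sketch, not taken from the book): fit a linear predictor by gradient descent on the empirical squared loss, then compare its error on the observed (training) data with its error on held-out data.

    import numpy as np

    rng = np.random.default_rng(1)

    # Linear model with noise; the first 200 points are observed, the rest held out.
    n, d = 400, 10
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    y = X @ w_true + 0.5 * rng.normal(size=n)
    X_tr, y_tr, X_te, y_te = X[:200], y[:200], X[200:], y[200:]

    def mse(w, X, y):
        return np.mean((X @ w - y) ** 2)

    # Gradient descent on the empirical (training) squared loss.
    w = np.zeros(d)
    lr = 0.05
    for _ in range(500):
        grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)
        w -= lr * grad

    print("error on observed data:", round(mse(w, X_tr, y_tr), 3))
    print("error on held-out data:", round(mse(w, X_te, y_te), 3))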
Rui Wang: Introduction to Causal Inference
This project focuses on the fundamental concepts and methods of causal inference. The tentative content includes the potential outcome framework, causal graphical models, randomized trials, and observational studies. The specific content will be adjusted according to the mentee’s background and interests.
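To make the potential outcome framework concrete (simulated data with made-up numbers): every unit has two potential outcomes Y(0) and Y(1), only one of which is ever observed; when treatment is randomized, a simple difference in means estimates the average treatment effect.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000

    # Potential outcomes Y(0) and Y(1) for every unit; the true average effect is 2.
    y0 = rng.normal(loc=0.0, scale=1.0, size=n)
    y1 = y0 + 2.0 + rng.normal(scale=0.5, size=n)

    # Randomized trial: each unit is treated with probability 1/2,
    # independently of its potential outcomes.
    treated = rng.binomial(1, 0.5, size=n).astype(bool)
    y_obs = np.where(treated, y1, y0)   # only one potential outcome is observed

    ate_hat = y_obs[treated].mean() - y_obs[~treated].mean()
    print("true ATE:     ", round((y1 - y0).mean(), 3))
    print("estimated ATE:", round(ate_hat, 3))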
Yeting Wu: Analysis of Variance
This project will explore how visual cues, such as profile photos, influence the formation of online friendships on social networking sites like Facebook. We will discuss theories from computer-mediated communication, including the hyperpersonal model and physical attractiveness stereotypes, and examine how statistical methods like factorial experimental design and ANOVA are used to study impression formation. The mentee will read and analyze research on online self-presentation, design choices in experiments, and how data are interpreted to draw social and behavioral insights.
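As a sketch of the statistical machinery (hypothetical factors and simulated ratings, analyzed with statsmodels; the actual studies and factor names would come from the readings): a 2x2 factorial design crossing profile photo with message tone, followed by a two-way ANOVA with an interaction term.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)

    # Hypothetical 2x2 factorial design: profile photo (attractive / plain)
    # crossed with message tone (warm / neutral), 50 simulated raters per cell.
    rows = []
    for photo in ["attractive", "plain"]:
        for tone in ["warm", "neutral"]:
            base = (5.0
                    + (0.8 if photo == "attractive" else 0.0)
                    + (0.4 if tone == "warm" else 0.0))
            for rating in rng.normal(loc=base, scale=1.0, size=50):
                rows.append({"photo": photo, "tone": tone, "rating": rating})
    df = pd.DataFrame(rows)

    # Two-way ANOVA: main effects of photo and tone plus their interaction.
    model = smf.ols("rating ~ C(photo) * C(tone)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))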
Yuhan Qian: Introduction to Gaussian Processes
Everyone knows the Gaussian distribution. Its infinite-dimensional generalization, the Gaussian Process (GP), is a widely used tool in supervised machine learning, particularly for regression and classification tasks. In this project, we will begin by exploring the fundamental mathematical concepts and the standard GP model. We will also apply GPs to some interesting problems in clinical trials.
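As a preview of the standard GP regression model (a minimal numpy sketch with a squared-exponential kernel; the clinical-trial applications would build on ideas like this), the snippet computes the GP posterior at a few test points from noisy observations of sin(x).

    import numpy as np

    def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
        """Squared-exponential (RBF) covariance between two sets of 1-D inputs."""
        sq_dist = (a[:, None] - b[None, :]) ** 2
        return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

    def gp_posterior(x_train, y_train, x_test, noise=0.1):
        """Posterior mean and covariance of a zero-mean GP at the test inputs."""
        K = rbf_kernel(x_train, x_train) + noise**2 * np.eye(len(x_train))
        K_s = rbf_kernel(x_train, x_test)
        K_ss = rbf_kernel(x_test, x_test)
        alpha = np.linalg.solve(K, y_train)
        mean = K_s.T @ alpha
        cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
        return mean, cov

    # Toy regression problem: noisy observations of sin(x).
    rng = np.random.default_rng(0)
    x_train = rng.uniform(-3, 3, size=15)
    y_train = np.sin(x_train) + 0.1 * rng.normal(size=15)
    x_test = np.linspace(-3, 3, 5)

    mean, cov = gp_posterior(x_train, y_train, x_test)
    print("posterior mean:", np.round(mean, 2))
    print("posterior sd:  ", np.round(np.sqrt(np.diag(cov)), 2))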