Spring 2023 Projects


Andrea Boskovic and Harshil Desai: NBA Analytics and Machine Learning

Student: Shubh Tandon
Slides | Writeup
Student: Haoquan Fang
Slides | Writeup
Prerequisites: Some experience in R or Python; some knowledge about basketball.

Have you ever wondered how to predict which NBA rookie will become an All-Star, or how teams choose which players to draft? In this project, we will explore NBA data to build a model that predicts something related to basketball. We will start with an introduction to basic machine learning models, learn how to implement models in R or Python, and evaluate the models we’ve created. Potential directions could include (but are definitely not limited to) ranking players based on box scores and advanced stats, predicting who will be the MVP, or predicting a team’s odds of making the playoffs in a given year. We are willing to mentor two students!



Antonio Olivas: Estimation of cancer screening models using deconvolution

Student: Yanting Hu
Slides | Writeup
Prerequisites: Calculus (MATH 126) and exposure to probability theory (STAT 340).

Cancer screening programs are an important component of secondary cancer prevention. To understand the conditions under which a cancer screening program provides the greatest benefit, mathematical models are used to estimate relevant quantities using information from cancer screening trials.

In the natural history of a cancer, the time to (subclinical) cancer onset and the sojourn/latent time (the time between onset and clinical appearance) are two quantities of interest, but they cannot be observed separately. However, by using a screening tool we obtain some information that allows us to differentiate between these two components. In this project we will study a mathematical model that uses aggregate-level information from a cancer screening trial to estimate the mean time to onset, the mean sojourn time, and the sensitivity of the screening test, via the deconvolution formula and maximum likelihood estimation.
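
As a rough illustration of why deconvolution enters here, suppose that the time to onset U and the sojourn time S are independent, nonnegative random variables with densities f_U and f_S. This independence assumption and the symbols U, S, T, f_U, f_S are introduced only for this sketch and are not necessarily those of the model studied in the project. The time to clinical appearance is then T = U + S, its density is the convolution of the two component densities, and recovering the components from information about T is a deconvolution problem:

```latex
T = U + S, \qquad
f_T(t) = \int_0^t f_U(u)\, f_S(t - u)\, du, \qquad
\mathbb{E}[T] = \mathbb{E}[U] + \mathbb{E}[S].
```

The last identity shows why clinical data alone cannot separate the two means: only their sum is identified without extra information such as screening results.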



Erin Lipman: Bayesian perspectives on probability and statistics

Student: Leila Peitsch
Slides | Writeup
Prerequisites: Probability at the level of 311, and some programming experience (preferably R).

Many of the methods we focus on in introductory statistics courses, such as confidence intervals and null hypothesis significance testing, come from the “Frequentist” philosophy of statistics, which interprets probability as describing the relative frequency of a certain event over repeated trials (e.g., if I flip a fair coin 100 times, about 50 of these flips will land on heads). “Bayesian” statistics, on the other hand, interprets probability as describing our belief and uncertainty about an event (e.g., if I flip a coin once, it is equally likely to come up heads or tails). Because the Bayesian perspective views probability in terms of belief, it provides a rigorous framework for updating our belief in light of new data (e.g., if I see that my coin lands on heads 100 out of 100 times, I might start to suspect that it is a fake coin where both sides are heads). In this DRP, we will learn how the Bayesian framework allows us to update our beliefs in light of new data and allows us to answer questions that we cannot answer within the frequentist perspective.
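
The coin example can be made concrete with a minimal sketch of Bayesian updating. It assumes only two hypotheses, a fair coin and a two-headed coin, and a prior that puts probability 0.001 on the two-headed coin. That prior, the hypothesis names, and the helper functions `update` and `posterior_after_heads` are all illustrative choices made for this sketch, not part of the project description.

```python
def update(prior, likelihoods):
    """One application of Bayes' rule: posterior is proportional to prior * likelihood."""
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Probability of observing heads on one flip under each hypothesis.
P_HEADS = {"fair": 0.5, "two_headed": 1.0}


def posterior_after_heads(prior, n_heads):
    """Belief about the coin after observing n_heads heads in a row."""
    belief = dict(prior)
    for _ in range(n_heads):
        belief = update(belief, P_HEADS)
    return belief


# Assumed prior: we start out almost certain that the coin is fair.
prior = {"fair": 0.999, "two_headed": 0.001}

for n in (0, 10, 100):
    print(n, posterior_after_heads(prior, n))
```

Under these assumptions, ten heads in a row already make the two hypotheses roughly equally plausible, and one hundred heads in a row leave almost no belief in the fair coin.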



Ethan Ancell: Classical papers in statistics

Student: YoungMin (Janice) Kim
Slides | Writeup
Prerequisites: Students should have a good foundation in probability theory and some of the basic ideas in theoretical statistics (e.g., maximum likelihood estimation and the central limit theorem). Although we will be reading research articles, no prior experience in statistics research is necessary.

This directed reading project (DRP) will be a broad tour through some of the classical and influential research papers in the statistics and statistics-adjacent literature (e.g., papers from authors like Fisher, Shannon, Pearson, Rao, Lehmann, Akaike). The mentor and mentee will read one research paper a week and meet to discuss it. As this project will be more of a literature review, breadth will be emphasized over depth.



Rrita Zejnullahi: Decisions in risky situations

Student: Kreslyn Hinds
Slides | Writeup
Prerequisites: Knowledge of introductory statistics and calculus.

This project is about individual decision-making under uncertainty. We focus on the problem of how to optimally allocate a finite amount of resources across different locations/populations. While this decision task has wide application (e.g., the allocation of health care funds to different populations), we begin with the problem of reducing national poverty via subnational cash transfers. In the first half of the project, we will read papers on poverty and inequality. In the second half, we will conduct cognitive interviews as part of a think-aloud study at UW to map out the space of cognitive heuristics/problem-solving skills individuals use to allocate a fixed budget to different locations. By the end of the project, we will also identify areas that need development of new methods.



Vydhourie R T Thiyageswaran: Information flow and resistance in graphs

Prerequisites: Should be comfortable with linear algebra. We can go over everything else together.

We will study information flow on networks by studying resistance in electrical networks. We will also look at research papers and their approach to maximizing information flow in this setting.
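
To give a flavor of the linear algebra involved, here is a minimal sketch that computes the effective resistance between two nodes of a small resistor network. It builds the graph Laplacian with conductance weights, grounds one node, injects one unit of current at the other, and solves the resulting linear system by Gaussian elimination. The function name `effective_resistance`, the edge-list format, and the example graphs are assumptions made for this illustration. How resistance relates to information flow is the subject of the project and is not addressed here.

```python
def effective_resistance(n, edges, s, t):
    """Effective resistance between nodes s and t of a connected graph.

    n     -- number of nodes, labeled 0 .. n-1
    edges -- list of (u, v, resistance) tuples with resistance > 0
    """
    if s == t:
        return 0.0

    # Weighted graph Laplacian; each edge contributes its conductance 1/resistance.
    lap = [[0.0] * n for _ in range(n)]
    for u, v, resistance in edges:
        c = 1.0 / resistance
        lap[u][u] += c
        lap[v][v] += c
        lap[u][v] -= c
        lap[v][u] -= c

    # Ground node t (voltage 0) by dropping its row and column,
    # then inject one unit of current at node s.
    keep = [i for i in range(n) if i != t]
    a = [[lap[i][j] for j in keep] for i in keep]
    b = [1.0 if i == s else 0.0 for i in keep]
    m = len(keep)

    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda row: abs(a[row][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("graph must be connected")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution gives the node voltages.
    volts = [0.0] * m
    for row in range(m - 1, -1, -1):
        acc = b[row] - sum(a[row][k] * volts[k] for k in range(row + 1, m))
        volts[row] = acc / a[row][row]

    # With unit current, the voltage at s equals the effective resistance.
    return volts[keep.index(s)]


# Two unit resistors in series (path 0 - 1 - 2).
path = [(0, 1, 1.0), (1, 2, 1.0)]
print(effective_resistance(3, path, 0, 2))

# Triangle of unit resistors: one resistor in parallel with two in series.
triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
print(effective_resistance(3, triangle, 0, 1))
```

The two examples match the familiar series and parallel rules: the path gives 1 + 1 = 2, and the triangle gives 1 in parallel with 2, which is 2/3.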