Logistic PCA and Generalized PCA
My dissertation research with Prof. Yoonkyung Lee deals with dimensionality reduction of binary and count data. We propose a generalization of principal component analysis (PCA) to non-Gaussian data. Our method minimizes the deviance by solving for a projection matrix that projects the natural parameters of the saturated model onto a lower-dimensional space. Two preprint articles are available here. An R package implementing this research for binary data is available on CRAN. A complementary R package for all types of data is available on GitHub. For this research, I won the department’s Whitney Award for Outstanding Thesis Researcher.
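As a rough sketch of the objective described above, specialized to binary data: let X be the n × d binary data matrix, let Θ̃ hold the natural parameters of the saturated model, and let U be a d × k matrix with orthonormal columns, so that UUᵀ is the projection matrix. These symbols, and the vector of main effects μ, are notation assumed here for illustration; the preprints give the exact formulation.

```latex
\begin{aligned}
\min_{\mu \in \mathbb{R}^{d},\; U \in \mathbb{R}^{d \times k}} \quad
  & D(X; \hat{\Theta})
    = -2 \sum_{i=1}^{n} \sum_{j=1}^{d}
      \left[ x_{ij}\,\hat{\theta}_{ij}
             - \log\!\left(1 + e^{\hat{\theta}_{ij}}\right) \right] \\
\text{subject to} \quad
  & \hat{\Theta} = \mathbf{1}_n \mu^{\top}
      + \left(\tilde{\Theta} - \mathbf{1}_n \mu^{\top}\right) U U^{\top},
  \qquad U^{\top} U = I_k .
\end{aligned}
```

For binary data the saturated natural parameters are infinite (the logit of 0 or 1), so a sketch like this would need a large finite stand-in, for example θ̃_ij = m(2x_ij − 1) for some m > 0. That choice is also an assumption of this illustration.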
Origin-Destination Estimation on Bus Routes
Given passenger boarding and alighting counts that are automatically generated by buses, we have been developing methods to improve origin-destination (OD) flow estimation. Our method uses variational Bayes to estimate the posterior of the OD flows. When multiple patterns may occur during a period, we can determine the number of patterns and cluster bus trips, which leads to increased accuracy. Two papers are in preparation and an R package is in early development. For my work at the transit lab, I was awarded the Whitney Award for best Research Associate.
Predicting Student Enrollment
During my summer at Data Science for Social Good, our team worked with Chicago Public Schools to more accurately predict the number of students enrolling at each of the system’s 600 schools. A blog post describing the problem is here and the GitHub repository of some of our solutions is here.
Public Transportation and GHG
We broadly assessed how much public transportation reduces greenhouse gas (GHG) emissions in US cities. Two papers have been published based on this research. The first describes the data collection process and initial results. The second details the model we built, which accounts for potential biases.
Capital One Student Modeling Competition
I was part of the team that won the 2013 Capital One Student Modeling Competition. The task was to build a recommender system to offer the most relevant coupons to customers. Our solution extended the matrix factorization techniques that were used in the Netflix Prize.
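For readers unfamiliar with the technique, here is a minimal sketch of basic regularized matrix factorization of the kind popularized during the Netflix Prize, fit by stochastic gradient descent. It is not our competition solution, which extended these ideas; the function names, hyperparameters, and toy customer-coupon scores are all hypothetical.

```python
import numpy as np


def factorize(ratings, rank=2, lr=0.05, reg=0.02, epochs=500, seed=0):
    """Fit customer and coupon factors to (customer, coupon, score) triples by SGD."""
    rng = np.random.default_rng(seed)
    n_users = max(u for u, _, _ in ratings) + 1
    n_items = max(i for _, i, _ in ratings) + 1
    # Small random starting values for the latent factors.
    P = 0.1 * rng.standard_normal((n_users, rank))  # customer factors
    Q = 0.1 * rng.standard_normal((n_items, rank))  # coupon factors
    for _ in range(epochs):
        # Visit the observed scores in a fresh random order each epoch.
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - P[u] @ Q[i]
            p_old = P[u].copy()
            # Gradient step on squared error with an L2 penalty.
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error over the observed triples."""
    return float(np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings])))


# Hypothetical toy data: (customer, coupon, relevance score).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = factorize(ratings)

# Predicted scores for every customer-coupon pair, including unobserved ones.
predictions = P @ Q.T
```

The rows of `predictions` can then be sorted to rank coupons for each customer. A realistic system would also need bias terms, held-out validation, and a treatment of implicit feedback, none of which are shown here.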