I started as a PhD student at Mila in January 2022, where I study deep learning under the supervision of Yoshua Bengio.
Before that, I was an AI researcher at Microsoft, working on the large-scale deployment of GPT-3, principled approaches to training large deep learning models, and theories of infinitely wide neural networks. I'm broadly interested in machine learning and natural language processing.
I spent a year as an AI resident at Microsoft Research, after graduating with a BSc in Computer Science and Cognitive Science from Johns Hopkins University. While at JHU, I worked for the Center for Language and Speech Processing and was advised by Ben Van Durme.