I am a postdoctoral researcher at UC Berkeley interested in algorithmic aspects of machine learning and artificial intelligence, with a focus on practical challenges such as scalability, efficiency, and social impact. I hold an SNF Early Postdoc.Mobility fellowship and am hosted by Prof. Moritz Hardt. Before that, I worked as a research scientist at IBM Research Zurich, where I contributed to the development of the Snap ML library. I obtained my PhD from ETH Zurich, where I was affiliated with the Data Analytics Laboratory and supervised by Prof. Thomas Hofmann.
Machine learning is increasingly used to support consequential decisions. When predictions inform decisions, they have the potential to change the way the broader system behaves. In particular, they can alter the data distribution the predictive model was trained on -- a dynamic effect that traditional machine learning fails to account for. To address this, we introduce performative prediction, a framework that brings these effects into supervised learning [ICML'20]. We analyze the dynamics of retraining strategies in this setting and address the challenges that arise in stochastic optimization when the deployment of a model triggers performative effects in the very data distribution it is trained on [NeurIPS'20]. As a subfield of learning theory, performative prediction is only starting to receive attention from the community, and there are many exciting, unexplored connections to questions in causality, control theory, economics, and sociology.
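The retraining dynamic described above can be illustrated with a toy sketch (not taken from the papers): suppose deploying a model parameter theta shifts the data distribution, here assumed to be a Gaussian whose mean moves linearly with theta, and the learner repeatedly refits to data sampled from the distribution its last model induced. The names, distribution, and constants below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, eps = 1.0, 0.5  # assumed: base mean and performativity strength (eps < 1)

def sample(theta, n=100_000):
    # Assumed performative response: deploying theta shifts the
    # data distribution to D(theta) = N(mu + eps * theta, 1).
    return rng.normal(mu + eps * theta, 1.0, size=n)

theta = 0.0
for t in range(20):
    data = sample(theta)      # observe data under the currently deployed model
    theta = data.mean()       # retrain: minimizer of squared loss on new data

# Repeated retraining contracts toward the fixed point mu / (1 - eps),
# a point that is stable under the distribution shift it itself induces.
print(theta)
```

In this toy setting the iterates converge geometrically (at rate eps) to mu / (1 - eps) = 2.0, a simple analogue of the stability phenomena analyzed for retraining strategies.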
When training machine learning models in production, speed and efficiency are critical. Fast training enables short development cycles, offers quick time-to-insight, and, not least, saves valuable resources. Our approach to achieving fast training is to enable the efficient use of modern hardware through novel algorithm design. In particular, we develop principled tools and methods for training machine learning models that target compute parallelism [NeurIPS'19][ICML'20], hierarchical memory structures [HiPC'19][NeurIPS'17], accelerator units [FGCS'17], and interconnect bandwidth in distributed systems [ICML'18]. We demonstrated [NeurIPS'18] that this approach can reduce training time by several orders of magnitude compared to standard system-agnostic methods. The core innovations of this research have been integrated into the IBM Snap ML library and help diverse companies improve the speed, efficiency, and scalability of their machine learning workloads.