Research Projects
DSPy
Accepted to the Twelfth International Conference on Learning Representations (ICLR 2024).
Accepted to the NeurIPS 2023 Workshop R0-FoMo: Robustness of Few-shot and Zero-shot Learning in Large Foundation Models.
DSPy is a framework for solving advanced tasks with language models (LMs) and retrieval models (RMs). It emphasizes programming over traditional prompting: you express any pipeline in clean, Pythonic control flow and can swap in models like GPT-3.5, Llama 2, and T5. In doing so, DSPy unifies prompting, finetuning, reasoning, and retrieval augmentation into a single approach to LM-based task solving.

Drawing inspiration from PyTorch, DSPy lets users declare the modules they need and what each should accomplish, rather than how to implement it. Signatures provide this declarative specification of a module's input/output behavior, communicating clear instructions to the LM, while teleprompters optimize each module by learning from demonstrations to generate effective prompts for the task at hand. The DSPy compiler then traces the modularized program and trains the LMs to execute each declarative step with optimized prompts. The result is a framework that streamlines LM-based task solving from concept to completion.
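Here is a minimal sketch of that programming model: a signature, a module built from it, and a teleprompter that compiles the program. The model name, metric, and one-example trainset are illustrative toys, not the setup from the paper.

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# Configure the underlying LM (model name is illustrative).
dspy.settings.configure(lm=dspy.OpenAI(model="gpt-3.5-turbo"))

# A Signature declares input/output behavior, not an implementation.
class BasicQA(dspy.Signature):
    """Answer questions with short factoid answers."""
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

# A module composes signatures with a prompting strategy (here, chain of thought).
class QAProgram(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate_answer = dspy.ChainOfThought(BasicQA)

    def forward(self, question):
        return self.generate_answer(question=question)

# A teleprompter "compiles" the program: it bootstraps demonstrations from a
# (here, tiny) trainset and optimizes the prompt for each module.
def exact_match(example, prediction, trace=None):
    return example.answer.lower() == prediction.answer.lower()

trainset = [dspy.Example(question="What is the capital of France?",
                         answer="Paris").with_inputs("question")]
compiled_qa = BootstrapFewShot(metric=exact_match).compile(QAProgram(), trainset=trainset)
print(compiled_qa(question="Who wrote Hamlet?").answer)
```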
I’ve had the great pleasure of working under the mentorship of Omar Khattab and the DSPy team at Stanford in bringing DSPy to life.
Want to learn more? Check out these resources:
- arXiv: Read our paper on arXiv.
- GitHub: Explore the code and project details on GitHub.
- Twitter/X Thread: Follow the discussion and updates on the release thread.
PORT
Under review.
PORT (Private Optimal Runtime Training) is a solver framework that jointly optimizes network runtime and RAM usage when training models on memory-constrained devices. By leveraging the unified memory of on-device accelerators, PORT reasons over individual network layers and frames training as an integer linear program (ILP), enabling dynamic selection of the best accelerator for each network node and leading to substantial runtime improvements across diverse devices.
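Since the paper is still under review, the exact formulation isn't public; below is a toy ILP in the same spirit, choosing an accelerator per layer to minimize total runtime under a shared-memory budget. All names and numbers are invented for illustration, and the PuLP solver library stands in for PORT's actual solver.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, value

layers = ["conv1", "conv2", "fc"]
devices = ["cpu", "gpu"]
runtime = {  # seconds per training step (invented numbers)
    ("conv1", "cpu"): 9.0, ("conv1", "gpu"): 2.0,
    ("conv2", "cpu"): 7.0, ("conv2", "gpu"): 1.5,
    ("fc",    "cpu"): 3.0, ("fc",    "gpu"): 1.0,
}
memory = {  # MB of unified memory each (layer, device) choice consumes
    ("conv1", "cpu"): 100, ("conv1", "gpu"): 300,
    ("conv2", "cpu"): 120, ("conv2", "gpu"): 350,
    ("fc",    "cpu"): 60,  ("fc",    "gpu"): 200,
}
MEM_BUDGET = 700  # MB of unified memory available

prob = LpProblem("port_style_placement", LpMinimize)
# x[l, d] = 1 iff layer l runs on device d
x = {(l, d): LpVariable(f"x_{l}_{d}", cat=LpBinary) for l in layers for d in devices}

# Objective: minimize total runtime across all placed layers.
prob += lpSum(runtime[l, d] * x[l, d] for l in layers for d in devices)
# Each layer is placed on exactly one device.
for l in layers:
    prob += lpSum(x[l, d] for d in devices) == 1
# Total memory consumption must fit in the unified-memory budget.
prob += lpSum(memory[l, d] * x[l, d] for l in layers for d in devices) <= MEM_BUDGET

prob.solve()
for l in layers:
    for d in devices:
        if value(x[l, d]) > 0.5:
            print(f"{l} -> {d}")
```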
I’ve had the great pleasure of working under the mentorship of Shishir Patil and the research team in the Sky Lab at UC Berkeley.
Quadrotor Drone Stability (in-progress)
This project involves training quadrotor drones to stabilize and reach target positions in simulated environments where one, two, or three propellers have failed.
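One simple way to inject propeller failure during training is to mask the commanded thrust of randomly chosen rotors each episode; the sketch below shows only that masking idea. The simulator, policy, and thrust values are hypothetical stand-ins, not the project's actual setup.

```python
import numpy as np

def sample_failure_mask(num_rotors=4, num_failed=1, rng=None):
    """Return a 0/1 mask over rotors with `num_failed` rotors disabled."""
    rng = rng or np.random.default_rng()
    mask = np.ones(num_rotors)
    failed = rng.choice(num_rotors, size=num_failed, replace=False)
    mask[failed] = 0.0
    return mask

# At each control step, the policy's rotor commands are multiplied by the
# mask, so the controller must learn to stabilize and reach the target with
# the remaining propellers.
mask = sample_failure_mask(num_failed=2)    # e.g., two propellers fail
action = np.array([0.6, 0.6, 0.6, 0.6])    # hypothetical normalized thrusts
print(mask, action * mask)
```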
I’ve had the great pleasure of working under the mentorship of Zhongyu Li and the research team in the Hybrid Robotics group at UC Berkeley.