Research

AI Safety & Evaluations

Current role: Member of Technical Staff at P-1 AI, working as an AI eval research engineer specializing in LLM-based agents.

Previous: Algoverse AI Safety Fellowship — Evaluating agents on long-horizon tasks.

LLM agents are increasingly deployed to carry out complex, multi-step tasks on behalf of users. Over the course of such tasks, can agents retain their alignment training, remember the original goal, and adapt to unexpected changes in the environment?

Astrophysics

Publications

Google Scholar · NASA ADS
37 peer-reviewed papers, 5 first-author
600+ citations · H-index: 12

Ph.D. Thesis: The Dragonfly Ultrawide Survey

Shen, Z. et al. 2024, The Astrophysical Journal, 976, 75

Our research group built the Dragonfly Telephoto Array to detect very large and diffuse galaxies that are missed by conventional telescopes.
We used this telescope to map 10,000 square degrees of the northern sky.
I led the science analysis of the entire dataset to find new galaxies.
I built a custom data pipeline on AWS and secured $10K in funding to deploy it.
We found 11 large, low surface brightness galaxies with spectroscopic confirmation.

Press & Outreach

Won third place in the Yale Three-Minute Thesis Competition, competing against Ph.D. students from across the university.
Used Hubble Space Telescope data to measure the distance to a galaxy lacking dark matter. Results reported by STScI, Yale News, and IAS.
