AI Safety & Evaluations

Current role: Member of Technical Staff at P-1 AI, working as an AI eval research engineer specializing in LLM-based agents.

Previous: Algoverse AI Safety Fellowship, evaluating agents on long-horizon tasks.

LLM agents are increasingly deployed to carry out complex, multi-step tasks on behalf of users. Over the course of such a task, can an agent retain its alignment training, remember the original goal, and adapt to unexpected changes in the environment?


Astrophysics

Publications

Ph.D. Thesis: The Dragonfly Ultrawide Survey

Shen, Z. et al. 2024, The Astrophysical Journal, 976, 75

  • Our research group built the Dragonfly Telephoto Array to detect very large and diffuse galaxies that are missed by conventional telescopes.
  • We used this telescope to map 10,000 square degrees of the northern sky.
  • I led the science analysis of the entire dataset to find new galaxies.
  • I built a custom data pipeline on AWS and secured $10K in funding to deploy it.
  • We found 11 large, low surface brightness galaxies with spectroscopic confirmation.

Press & Outreach