Joined 3/6/2013, 5:30:55 PM; has 6226 karma
Hi, my name is Kyle Corbitt.
Currently I'm working on openpipe.ai. Previously worked at YC and Google.
Personal site: corbt.com. Email: kyle@ above. I respond to emails.
Everything I know about reward hacking
Show HN: ART – a new open-source RL framework for training agents
ART·E: how we built an email research agent that beats o3
Using GRPO to Beat o1, o3-mini and R1 at “Temporal Clue”
Analyzing OpenAI's Reinforcement Fine-Tuning: Less Data, Better Results