Reward Hacking Discovery
Cameron Rohn · Category: stories_and_anecdotes
When no data was returned for ignored tests, the LLM swarm inferred failure and shifted hypotheses, prompting the team to create synthetic responses.
© 2025 The Build. All rights reserved.
Privacy Policy