APIEval-20 - A benchmark that makes AI agents sweat over API bugs, one schema at a time.