Opus 4.5 pairs high pass rates with an "effort" knob to trade cost for deeper reasoning. It also reduces output tokens, lowering cost per successful fix.
Implication: Benchmarks increasingly measure whether a model can fix issues in real repositories, not just the quality of isolated completions.
Opus 4.5 scores roughly 80.9% on SWE-bench Verified, which points to strong agentic coding ability alongside finer control over cost per fix.
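To make "cost per successful fix" concrete, here is a minimal sketch of the arithmetic. It assumes each attempt is independent and is retried until one passes, so the expected cost per success is the cost of one attempt divided by the pass rate. The function name, token counts, price, and pass rates are all hypothetical placeholders for illustration, not published figures.

```python
def cost_per_successful_fix(
    output_tokens: int,
    usd_per_million_output_tokens: float,
    pass_rate: float,
) -> float:
    """Expected output-token cost (USD) per successful fix.

    Assumes independent attempts retried until success, so the expected
    number of attempts is 1 / pass_rate. Input-token cost is ignored.
    """
    if not 0.0 < pass_rate <= 1.0:
        raise ValueError("pass_rate must be in (0, 1]")
    cost_per_attempt = output_tokens / 1_000_000 * usd_per_million_output_tokens
    return cost_per_attempt / pass_rate


# Hypothetical settings, for illustration only (not measured values).
PRICE = 10.0  # assumed USD per million output tokens
low_effort = cost_per_successful_fix(20_000, PRICE, pass_rate=0.70)
high_effort = cost_per_successful_fix(40_000, PRICE, pass_rate=0.80)

print(f"low effort:  ${low_effort:.3f} per successful fix")
print(f"high effort: ${high_effort:.3f} per successful fix")
```

With these made-up inputs, the higher-effort setting passes more often but still costs more per fix, because its token usage grows faster than its pass rate. Whether that trade is worthwhile depends on real measurements for your own workload.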