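SWE-smith synthesizes realistic software engineering tasks by introducing changes that break a repository's existing tests, and pairs each task with an execution environment and test-based validation. This broadens the pool of tasks available for training and evaluation.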
- 50k instances across 128 GitHub repos.
- Targets two bottlenecks: environment setup and patch validation.
- Aims to bring open models closer to production-grade behavior.
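
The test-based validation idea can be sketched in a few lines. This is a minimal illustration, not SWE-smith's actual code: the names `TaskInstance`, `make_task`, and `patch_resolves` are hypothetical, and test outcomes are modeled as plain dictionaries rather than real test runs. It assumes a candidate change counts as a task only if it breaks at least one previously passing test, and that a patch counts as a fix only if those tests pass again without breaking the others.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical illustration; none of these names come from SWE-smith itself.
Results = Dict[str, bool]  # test id -> whether the test passed


@dataclass(frozen=True)
class TaskInstance:
    fail_to_pass: frozenset  # tests broken by the synthesized change
    pass_to_pass: frozenset  # tests that must keep passing after a fix


def make_task(before: Results, after: Results) -> Optional[TaskInstance]:
    """Keep a candidate change only if it breaks a previously passing test."""
    passing_before = {t for t, ok in before.items() if ok}
    broken = {t for t in passing_before if not after.get(t, False)}
    if not broken:
        return None  # the change broke nothing, so it is not a usable task
    return TaskInstance(frozenset(broken), frozenset(passing_before - broken))


def patch_resolves(task: TaskInstance, patched: Results) -> bool:
    """A patch resolves the task if every required test passes again."""
    required = task.fail_to_pass | task.pass_to_pass
    return all(patched.get(t, False) for t in required)


# Example: the synthesized change breaks test_parse and leaves test_render intact.
before = {"test_parse": True, "test_render": True, "test_io": False}
after = {"test_parse": False, "test_render": True, "test_io": False}
task = make_task(before, after)
good_patch = {"test_parse": True, "test_render": True, "test_io": False}
bad_patch = {"test_parse": True, "test_render": False, "test_io": False}
```

Here `test_io` was already failing before the change, so it is excluded from both sets and cannot affect whether a patch is accepted.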