SWE-agent resolves about 12% of the issues in the full SWE-bench benchmark. It uses GPT-4 to write code and fix issues in GitHub repositories.
Wednesday, April 3, 2024
SWE-agent turns LLMs (e.g. GPT-4) into software engineering agents that can fix bugs and issues in real GitHub repositories. SWE-agent achieves state-of-the-art performance on the full SWE-bench benchmark, resolving 12.29% of issues.