AI researchers unveil an AI agent that can program better than GPT-4, putting the open-source project in competition with offerings like the recently launched Devin AI programmer.

The Princeton NLP team has developed SWE-agent, an open-source system that converts language models like GPT-4 into software engineering agents. These agents can fix bugs and problems in real-world GitHub repositories. With 12.29% of issues solved in the SWE-Bench test set, SWE-agent nearly matches the recently released commercial Devin (13.86%).

To do this, the team built a dedicated Agent-Computer Interface (ACI) that lets the language model browse a repository and view, edit, and execute code files. The ACI includes features such as a linter for syntax checking, a special file viewer, and directory browsing.
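To make the idea of an agent-computer interface more concrete, here is a minimal sketch in Python. It is not SWE-agent's actual code: the class `MiniACI`, its method names, and the use of `ast.parse` as a stand-in linter are all assumptions made for illustration. It only shows how directory browsing, a windowed file viewer, and a syntax check on edits could fit together.

```python
import ast
import os
import tempfile


class MiniACI:
    """Toy agent-computer interface (hypothetical; not SWE-agent's real code)."""

    def __init__(self, root, window=5):
        self.root = root      # repository root the agent may touch
        self.window = window  # number of lines shown per view

    def list_dir(self, rel="."):
        # Directory browsing: sorted entry names under a repository path.
        return sorted(os.listdir(os.path.join(self.root, rel)))

    def view(self, rel, start=1):
        # File viewer: return a numbered window of lines beginning at `start`.
        with open(os.path.join(self.root, rel)) as f:
            lines = f.read().splitlines()
        chunk = lines[start - 1:start - 1 + self.window]
        return "\n".join(f"{start + i}: {line}" for i, line in enumerate(chunk))

    def edit(self, rel, start, end, new_text):
        # Replace lines start..end (1-indexed, inclusive), linting before saving.
        path = os.path.join(self.root, rel)
        with open(path) as f:
            lines = f.read().splitlines()
        candidate = lines[:start - 1] + new_text.splitlines() + lines[end:]
        source = "\n".join(candidate) + "\n"
        try:
            ast.parse(source)  # stand-in linter: Python syntax check only
        except SyntaxError as err:
            # The file on disk is left untouched when the check fails.
            return f"edit rejected: {err.msg} (line {err.lineno})"
        with open(path, "w") as f:
            f.write(source)
        return "edit applied"


if __name__ == "__main__":
    # Build a throwaway "repository" containing one buggy file.
    repo = tempfile.mkdtemp()
    with open(os.path.join(repo, "calc.py"), "w") as f:
        f.write("def add(a, b):\n    return a - b\n")

    aci = MiniACI(repo)
    print(aci.list_dir())
    print(aci.view("calc.py"))
    print(aci.edit("calc.py", 2, 2, "    return a +"))    # rejected by the syntax check
    print(aci.edit("calc.py", 2, 2, "    return a + b"))  # applied
```

In this sketch, an edit that fails the syntax check is refused, so the model gets immediate feedback and the file is never saved in a broken state. A real interface would also need command execution, search, and sandboxing, which are left out here.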

Cognition AI raised 21 million US dollars for Devin

SWE-agent plays a similar role to Devin, an AI software developer built by the startup Cognition AI. Devin can collaborate with human developers, perform tasks independently, and submit its work for review. Devin is also designed to handle new, unfamiliar libraries, program entire applications, find bugs in codebases, and process bug reports and feature requests in open-source repositories.

Unlike SWE-agent, Devin is not yet publicly available; access is limited to select developers via a waiting list. Cognition AI recently closed a $21 million Series A funding round.

Summary
  • Researchers at Princeton University have developed SWE-agent, an open-source system that converts language models such as GPT-4 into software engineering agents that can fix bugs in real-world GitHub repositories.
  • On the SWE-Bench test set, SWE-agent solves 12.29% of problems, close to the 13.86% achieved by Devin, the recently introduced commercial AI programmer from Cognition AI.
  • Devin is not yet publicly available, while the Princeton team has released SWE-agent and is looking for input for further development.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.