Faced with the increasing capabilities of AI in programming, the human-centered Codeforces coding competition has decided to ban AI entries. However, enforcing this ban may prove challenging.
Codeforces, an online programming platform, has introduced a new rule restricting the use of AI systems like GPT, Gemini, Gemma, Llama, and Claude for solving programming tasks in competitions.
Codeforces founder Mike Mirzayanov says this step is necessary because neural networks have made significant progress recently, reaching "new heights that cannot be overlooked."
The decision follows OpenAI's release of its o1 model, which showed impressive results in programming competitions. In simulated Codeforces contests, o1 achieved an Elo rating of 1807, outperforming 93 percent of human participants. In a live competition, o1 nearly reached the "Master" level.
The new rule applies only to competition participation and still allows limited AI use, such as translating tasks or using code completion tools for syntax and minor coding suggestions. However, using AI to generate the core logic or algorithms of a solution is prohibited.
Enforcing the ban will likely be difficult, as Codeforces mainly relies on participants' integrity.
One user criticized the decision, noting that competitors can easily modify AI-generated code so that it differs from other submissions, without ever understanding the solution. The user argues that the future of competitive programming sites ultimately depends on the trustworthiness of contestants, and that fighting AI models is a losing battle from the start.
Similar issues have arisen in other competitive mind sports such as chess and Go, where AI now outperforms humans. In these games, however, humans still compete directly against each other, preserving the spirit of competition. This is much harder to ensure in anonymous online coding contests.
Codeforces plans to closely monitor developments in AI technology and adjust rules as needed to balance fair competition with the benefits of AI-powered learning.
Programmer George Hotz says o1 is the first AI model that can code
Renowned programmer George Hotz sees great potential in o1 for programming. He believes the ChatGPT o1-preview model is the first that is truly capable of programming, citing an estimate that puts its IQ at 120 on the Norway Mensa test.
"ChatGPT o1-preview is the first model that's capable of programming (at all). Saw an estimate of 120 IQ, feels about right. Very bullish on RL in development environments. Write code, write tests, check work…repeat," Hotz writes.
Hotz became famous for his work on jailbreaking the iPhone and PlayStation 3. He later founded Comma.ai, a company focused on developing self-driving car technology.