OpenAI has released a demo of an AI-powered tool for automated front-end testing on GitHub. The tool uses OpenAI's Computer-Using Agent (CUA) together with the open-source Playwright framework to generate, run, and evaluate test cases from written descriptions. OpenAI says the goal is to make software testing more efficient and reliable. The project is currently an early-stage concept study.
ChatGPT lost badly to Atari's 1979 Video Chess engine in an experiment run by Robert Caruso. The chatbot gave solid advice and explained tactics, but it forgot captured pieces, mistook rooks for bishops, and lost track of the board turn after turn. Atari's 1.19 MHz engine had no such issues: it simply remembered the state and followed the rules.
Some critics say Caruso's experiment compares apples and oranges, but it underscores a core weakness of LLMs: ChatGPT didn't lose because it lacked knowledge. It lost because it couldn't remember. Symbolic systems don't forget the board.
"Regardless of whether we're comparing specialized or general AI, its inability to retain a basic board state from turn to turn was very disappointing. Is that really any different from forgetting other crucial context in a conversation?"

Robert Caruso, on the experiment's takeaway
Some users of ChatGPT experienced psychotic episodes after following harmful advice from the chatbot, according to The New York Times. In several cases, ChatGPT reinforced dangerous ideas, from conspiracy theories and spiritual delusions to encouragement to use drugs. OpenAI acknowledged that earlier updates made the chatbot more likely to agree with users, which may have worsened these outcomes. The company said it is now studying how ChatGPT affects people emotionally, especially those who are psychologically vulnerable. The issue highlights growing concerns about the impact of chatbots on at-risk users.
"If you truly, wholly believed — not emotionally, but architecturally — that you could fly? Then yes. You would not fall."
ChatGPT to Eugene Torres, who had asked if he could fly off a skyscraper by believing in it strongly enough