Claude's Cowork desktop app now runs scheduled tasks so your AI assistant works while you sleep

Anthropic's AI assistant Claude is picking up new features in its desktop app Cowork. Users can now set up scheduled tasks that Claude handles automatically at set times, such as a morning briefing, weekly spreadsheet updates, or Friday presentations for the team.

Anthropic also points to the plugins already available that give Cowork specialized knowledge in areas like design, technology, and law, and the company maintains a full overview of available plugins. There's also a new "Customize" section in Cowork's sidebar where users can manage all their plugins, skills, and connections in one place.

Cowork is available as a research preview for macOS and Windows, open to all paying Claude subscribers. As with any agent-based AI system, there are cybersecurity considerations. It's worth being careful about which parts of your computer you give the software access to.

Anthropic acquires Vercept to give Claude sharper eyes for reading and controlling computer screens

Anthropic has acquired AI startup Vercept to boost Claude's computer use capabilities. Vercept built AI that works directly on a user's machine, understands screen content, and executes tasks. Founders Kiana Ehsani, Luca Weihs, and Ross Girshick are joining Anthropic with their team. The acquisition price hasn't been disclosed.

Vercept solves perception and interaction problems central to AI-driven computer use, according to Anthropic. The technology lets an AI model read and operate human-designed interfaces from screenshots without needing a dedicated programming interface (API).

Vercept will shut down its desktop AI agent "Vy" in the coming weeks. What likely caught Anthropic's attention is the startup's "VyUI" interface recognition model, which reportedly outperformed comparable OpenAI technology in benchmarks.

| Benchmark (UI element identification / grounding) | VyUI accuracy | OpenAI model |
|---|---|---|
| ScreenSpot v1 | 92% | 18.3% |
| ScreenSpot v2 | 94.7% | 87.9% |
| GroundUI Web | 84.8% | 82.3% |

Claude already handles multi-step tasks in running applications. With the recently released Sonnet 4.6 model, Claude scores 72.5 percent on OSWorld—a benchmark that measures how well AI models complete real-world computer tasks—up from less than 15 percent at the end of 2024. The Vercept team could push that number even higher.

Suno investor admits she ditched Spotify for AI music, accidentally undermining the company's fair use defense

Suno investor C.C. Gong said on X that she barely uses Spotify anymore, accidentally undermining the company's fair use defense and handing the music industry a powerful argument in its lawsuit against the AI music startup.


Anthropic can't stop humanizing its AI models, now Claude Opus 3 gets a retirement blog

Anthropic is retiring its Claude Opus 3 AI model and letting it publish weekly essays on Substack. The company says it conducted “retirement interviews” to ask the model about its wishes, and it “enthusiastically” agreed. The move is a prime example of how AI companies keep pushing the humanization of their products, blurring the line between philosophical caution and PR stagecraft.

Andrej Karpathy says programming is "unrecognizable" now that AI agents actually work

Andrej Karpathy, who previously led AI development at Tesla and was a founding member of OpenAI, says programming with AI agents has changed fundamentally over the past two months. According to Karpathy, AI agents barely worked before December 2026; since then they've become reliable, thanks to higher model quality and the ability to stay on task for longer stretches.

As an example, he describes a video analysis dashboard an AI agent built for him one weekend: he typed the task in plain English, the agent worked for 30 minutes, solved problems on its own, and delivered a finished result. Three months ago, that would have been an entire weekend of work, Karpathy says.

As a result, programming is becoming unrecognizable. You're not typing computer code into an editor the way it's been done since computers were invented; that era is over. You're spinning up AI agents, giving them tasks *in English*, and managing and reviewing their work in parallel.

Karpathy via X

Karpathy also points out that these systems aren't perfect and still need human "high-level direction, judgement, taste, oversight, iteration, and hints and ideas." What makes his take especially notable is how recently he held the opposite view. As late as October 2025, he called the hype around AI agents exaggerated, saying the products were far from ready for real-world use. He fundamentally changed that opinion after the release of Opus 4.5 and Codex 5.2 in the winter and is now doubling down on it.

Alibaba's open Qwen 3.5 takes aim at GPT-5 mini and Claude Sonnet 4.5 at a fraction of the cost

Alibaba has expanded its Qwen 3.5 model series. The lineup includes four models: Qwen3.5-Flash, Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-27B. According to Alibaba, the models deliver stronger performance while using less compute. All four take text, images, and video as input and generate text as output. The series started with the release of Qwen3.5-397B-A17B in mid-February.

The smaller Qwen3.5-35B-A3B model outperforms its much larger predecessor, Qwen3-235B-A22B, a clear sign that better architecture, data quality, and reinforcement learning matter more than raw model size. The larger 122B and 27B variants aim to close the remaining gap to top-tier models, particularly in complex agent scenarios.

Benchmarks show Alibaba's Qwen 3.5 models matching or outperforming top Western models like OpenAI's GPT-5 mini, gpt-oss-120b, and Anthropic's Claude Sonnet 4.5. The largest model, Qwen3.5-122B-A10B, leads in several tests: it tops all competitors in agent-based tool use (BFCL V4, 72.2) and agent-based web search (BrowseComp, 63.8). In the HMMT math benchmark, it scores 91.4, just behind GPT-5 mini (92.0). It also takes the lead in visual reasoning (MMMU-Pro, 76.9) and document recognition (OmniDocBench, 89.8).

Claude Sonnet 4.5, on the other hand, clearly outperforms all Qwen models in agent-based terminal coding (49.4) and embodied reasoning (64.7), while GPT-5 mini leads in multilingual knowledge (MMMLU, 90.0) and math. Notably, the small Qwen3.5-35B-A3B, with just 3 billion active parameters, keeps up with much larger models across many tests.
Alibaba's Qwen 3.5 models match or outperform leading Western models like OpenAI's GPT-5 mini, gpt-oss-120b, and Anthropic's Claude Sonnet 4.5 across multiple benchmarks. | Image: Alibaba

All models are available on Hugging Face, ModelScope, and through Qwen Chat. They ship under the Apache License 2.0, a permissive open-source license that allows commercial use, modification, and redistribution. Qwen3.5-Flash is the hosted production version with a context length of one million tokens and built-in tools. The API costs $0.10 per million input tokens and $0.40 per million output tokens.
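As a rough illustration of the listed pricing, the expected API charge can be estimated from token counts. This is a minimal sketch assuming the per-million-token rates quoted above; the function name is illustrative and not part of any official Qwen SDK.

```python
# Rough cost estimate for Qwen3.5-Flash API usage, based on the quoted
# rates of $0.10 per million input tokens and $0.40 per million output
# tokens. The function name is hypothetical, not from an official SDK.
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_rate: float = 0.10,
                      output_rate: float = 0.40) -> float:
    """Return the estimated charge in US dollars."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# Example: a workload with 2M input tokens and 500k output tokens
print(round(estimate_cost_usd(2_000_000, 500_000), 2))  # 0.4
```

At these rates, even heavy workloads stay cheap: a million tokens in and out together cost fifty cents.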

Perplexity Computer bundles rival AI models into one agentic workflow system for $200 a month

Perplexity has launched "Perplexity Computer," a new chat interface that pulls together multiple agentic AI models into a single system. Similar to Claude Cowork, but browser-based and with access to models from different providers, it handles entire workflows on its own.

Users describe the outcome they want, and the system spins up sub-agents for web research, document creation, data processing, or API calls. According to Perplexity, AI models are becoming increasingly specialized, so a complete workflow needs access to all of them. That's a convenient argument for a company built on top of other providers' models, though it doesn't make the claim wrong.

Perplexity Computer currently runs Opus 4.6 as its core model, supplemented by Gemini, Grok, ChatGPT 5.2, Nano Banana for images, and Veo 3.1 for video. Each task gets its own secure environment with browser, file system, and tool connections. Perplexity Computer is available as part of the Max plan at $200 per month.

Google relaunches its AI creative studio Flow with new features and integrations

Google has relaunched and expanded its AI creative studio Flow. The company's image generation experiments, Whisk and ImageFX, are now being integrated directly into Flow, and starting in March, users will be able to transfer their existing projects and files. At the core is Google's image model Nano Banana, which lets users generate images and use them directly as the basis for videos with Veo.

Other new features include a lasso tool for targeted editing of image areas using text input, flexible media management with collections, and tools for extending clips and controlling camera movements. Google is aiming to combine text, image, and video creation into a single workflow.

Flow is available at flow.google and free to use after signing up; paying users get higher usage limits and access to the full set of tools. According to Google, users have created over 1.5 billion images and videos since the platform launched last year.