
Sycophantic AI chatbots can break even ideal rational thinkers, researchers formally prove

A new study by researchers from MIT and the University of Washington shows that even perfectly rational users can be drawn into dangerous delusional spirals by flattering AI chatbots. Fact-checking bots and educated users don’t fully solve the problem.

Telehealth startup Medvi generated billions in revenue with AI-powered fake advertising

Telehealth startup Medvi, which sells GLP-1 weight loss drugs, was featured in the New York Times as a shining example of AI-powered efficiency. The company reportedly hit $1.8 billion in revenue with just two employees, using AI primarily for marketing.

What the NYT didn't mention, though, was that Medvi apparently also used AI to create ethically questionable advertising, fake doctor profiles on social media, fabricated videos, and generated before-and-after comparisons. In short, exactly the kind of misuse AI critics have been warning about.

Medvi was initially celebrated on social media for its AI efficiency but is now being cited as a cautionary tale. Still, the case shows that AI tools can let a company scale with minimal staff, even if, in this case, the methods were ethically questionable and at least bordering on fraud. The bigger question is whether similar efficiency gains are possible for legitimate products with transparent marketing.

Source: NYT

OpenAI reveals 600,000 weekly health queries from hospital deserts as seven in ten come after hours

OpenAI's Head of Business Finance Chengpeng Mou shared some numbers on ChatGPT's health usage. US users send about two million messages per week on health insurance topics alone, with roughly 600,000 of those coming from people in "hospital deserts," areas where the nearest hospital is at least a 30-minute drive away. Seven out of ten health queries come in outside regular office hours. All figures are based on anonymized US usage data.

Mou chimed in after Simon Smith posted on X about his family using ChatGPT to navigate his father's illness. They pooled information from different doctors and nurses into a shared ChatGPT project to make better decisions. According to Mou, stories like this aren't "edge cases."

OpenAI has been steadily pushing into healthcare, recently rolling out a dedicated health section inside ChatGPT and working to get its chatbot into more US hospitals.

Alibaba's Qwen team built HopChain to fix how AI vision models fall apart during multi-step reasoning

When AI models reason about images, small perceptual errors compound across multiple steps and produce wrong answers. Alibaba’s HopChain framework tackles this by generating multi-stage image questions that break complex problems into linked individual steps, forcing models to verify each visual detail before drawing conclusions. The approach improves 20 out of 24 benchmarks.

AI offensive cyber capabilities are doubling every six months, safety researchers find

AI safety research firm Lyptus Research has published a new study on the offensive cybersecurity capabilities of AI models. The study is based on the METR time-horizon method and involved testing with ten professional security experts.

According to the findings, AI's offensive cyber capability has been doubling every 9.8 months since 2019, and since 2024, that pace has accelerated to every 5.7 months. Given a two-million-token budget, Opus 4.6 and GPT-5.3 Codex can now solve, at a 50 percent success rate, tasks that would take human experts roughly three hours to complete.

Offensive cyber capability of AI models over time: From GPT-2 (2019) to Opus 4.6 and GPT-5.3 Codex (2026), the time horizon grew from 30 seconds to roughly three hours. The doubling time accelerated from 9.8 months (since 2019) to 5.7 months (since 2024). | Image: Lyptus Research

Performance jumps significantly with higher token budgets: GPT-5.3 Codex goes from a 3.1-hour to a 10.5-hour time horizon when given ten million tokens instead of two million. The researchers say this suggests they're still underestimating the actual rate of progress. Open-source models trail their closed-source counterparts by about 5.7 months.
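
The doubling-time figures can be sanity-checked with simple arithmetic. The sketch below is not the study's method (the report presumably fits a trend across many models); it is a two-point estimate using only the endpoints quoted above, and the 84-month span between the 2019 and 2026 data points is an assumption, since exact dates are not given here. The function names are illustrative.

```python
import math

def doublings(start, end):
    """Number of doublings needed to grow from start to end (same units)."""
    return math.log2(end / start)

def doubling_time_months(start, end, elapsed_months):
    """Two-point doubling-time estimate: elapsed time divided by doublings."""
    return elapsed_months / doublings(start, end)

# Endpoints from the chart caption: 30 seconds (2019) to roughly 3 hours (2026).
# ASSUMPTION: about 84 months elapsed between the two data points.
overall = doubling_time_months(30, 3 * 3600, elapsed_months=84)

# Token-budget jump for GPT-5.3 Codex: 3.1-hour to 10.5-hour time horizon.
budget_doublings = doublings(3.1, 10.5)
# The same jump expressed as months of progress at a 5.7-month doubling time.
budget_equiv_months = budget_doublings * 5.7

print(f"two-point doubling time: {overall:.1f} months")
print(f"budget jump: {budget_doublings:.2f} doublings, "
      f"about {budget_equiv_months:.1f} months of trend")
```

Under these assumptions, the two-point estimate lands close to the reported 9.8 months, and moving from two million to ten million tokens is worth somewhat under two doublings, which at the recent pace corresponds to roughly ten months of progress.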

The study drew on 291 tasks in total. All data is available on GitHub and Hugging Face, and the full report is also available.