
Meta's own supervisory body warns that Community Notes are no match for AI disinformation

Image: Midjourney prompted by THE DECODER

Meta's Oversight Board has examined the planned global expansion of Community Notes. Its conclusion: the system is too slow, too thinly staffed, and vulnerable to manipulation, especially given the growing flood of AI-generated disinformation. In certain countries, Meta should not introduce the program at all.

Meta's Oversight Board reaches an unsurprising conclusion in a sweeping analysis: Community Notes, the system Meta used to replace professional fact-checking in the United States, has significant weaknesses. "Delays in note publication, the limited number of published notes and its dependence on the broader information environment's reliability raise serious doubts about the extent to which community notes can meaningfully address misinformation linked to harm," the board writes.

Meta announced the introduction of Community Notes at the start of U.S. President Donald Trump's second term, simultaneously ending its professional fact-checking program, which had been running for roughly a decade.

The problem is compounded by a development the Board explicitly identifies: AI-powered tools are facilitating the scaled creation and management of accounts and networks that could manipulate the system. The coverage gap is enormous: according to Meta, just around 900 Community Notes were published in the first six months of the U.S. rollout. Over the same period in the EU, professional fact-checkers enabled Meta to apply labels to approximately 35 million Facebook posts, as Angie Drobnic Holan, director of the International Fact-Checking Network, notes.

Only six percent of proposed notes are ever published

The Community Notes system is built on the open-source algorithm from X (formerly Twitter). Users can propose contextual annotations on public posts. Other users rate these as "helpful" or "not helpful." A note is only published once a so-called bridging algorithm determines that users who typically disagree with each other have rated the note as helpful.

In practice, most notes fail to clear this hurdle. According to a September 2025 update from Meta, only about six percent of all proposed notes are ever published. On X, the rate stands at 8.3 percent according to one study, with an average delay of 26 hours until publication, "well past the point of peak visibility for most misleading posts." Another analysis puts the average delay at 65.7 hours. Between January 2021 and January 2025, 87.7 percent of all notes proposed on X remained in the "Needs More Ratings" category, according to the same study, without ever being published.
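The bridging mechanic described above can be sketched in code. The following is a toy illustration, not the actual open-source algorithm: X's published scorer fits a matrix-factorization model in which each rating is decomposed into a global mean, a user intercept, a note intercept, and a user-viewpoint/note-viewpoint interaction, and a note is published only if its intercept (its helpfulness after the viewpoint factor has absorbed partisan agreement) clears a threshold. All names, hyperparameters, and the simulated data below are illustrative assumptions.

```python
import numpy as np

def fit_bridging_model(ratings, n_users, n_notes, dim=1,
                       lr=0.05, steps=2000, reg=0.03, seed=0):
    """Toy SGD fit of: rating ~ mu + b_user + b_note + f_user . f_note.
    ratings: list of (user, note, value) with value 1.0 = helpful, 0.0 = not.
    Returns the note intercepts b_note, i.e. helpfulness net of viewpoint."""
    rng = np.random.default_rng(seed)
    mu = 0.0
    bu = np.zeros(n_users)
    bn = np.zeros(n_notes)
    fu = rng.normal(0, 0.1, (n_users, dim))
    fn = rng.normal(0, 0.1, (n_notes, dim))
    for _ in range(steps):
        for u, n, r in ratings:
            err = r - (mu + bu[u] + bn[n] + fu[u] @ fn[n])
            mu += lr * err
            bu[u] += lr * (err - reg * bu[u])
            bn[n] += lr * (err - reg * bn[n])
            # simultaneous factor update (RHS evaluated before assignment)
            fu[u], fn[n] = (fu[u] + lr * (err * fn[n] - reg * fu[u]),
                            fn[n] + lr * (err * fu[u] - reg * fn[n]))
    return bn

# Two polarized camps: users 0-2 vs users 3-5. Note 0 is partisan
# (helpful only to one camp); note 1 is rated helpful by both camps.
ratings = ([(u, 0, 1.0) for u in range(3)] +
           [(u, 0, 0.0) for u in range(3, 6)] +
           [(u, 1, 1.0) for u in range(6)])
intercepts = fit_bridging_model(ratings, n_users=6, n_notes=2)
# The bridging note earns the higher intercept; in the real system a note
# publishes only above a cutoff (X's code uses a value around 0.40).
print(intercepts[1] > intercepts[0])
```

The viewpoint factor is what makes the model "bridging": one-sided enthusiasm for the partisan note is soaked up by the user-note interaction term rather than by the note's intercept, so only cross-camp agreement raises a note's score.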

A crucial difference from the former fact-checking program: content that receives a Community Note is neither downranked nor excluded from recommendations, according to the Board. There are "no strikes for posting content that receives a community note," and no effects on reach or monetization. Under the professional fact-checking program, by contrast, content rated false or misleading could be demoted in distribution and rejected for ads.

AI-generated manipulation as a growing threat

The Oversight Board explicitly warns about the system's vulnerability to coordinated manipulation, which is significantly facilitated by AI tools. "This risk will only become more acute as artificial intelligence facilitates the scaled creation and operation of accounts and networks," the analysis states. The Board also sees risks in AI-powered contributors: "Malicious actors could fine-tune models to subtly favor narratives, selectively frame evidence or exploit the rating mechanism - all while appearing neutral."

Recent research on X's Community Notes also shows that "a small minority (5-20%) of bad raters can strategically suppress targeted helpful notes." Another vulnerability: published notes do not "lock" until two weeks after consensus is reached, according to the Board. During that window, coordinated actors could remove a note through a flood of negative ratings.

Meta told the Board that it "does not plan to allow AI note writers (i.e., AI-powered chatbots or agents) to submit community notes on Meta's platforms. Contributors may use AI to help them write notes; however, a human must submit the note under their name." The company also stated that, to date, it "has not detected any coordinated inauthentic behavior or gaming of the program." Whether Meta's safeguards are adequate for the potential scale of the threat, however, "is not clear from the information provided to the Board."

Southport riots: A single note across more than a thousand disinformation posts

The system's weaknesses are particularly evident in crisis situations. The Board points to an investigation of the Southport riots in the UK in 2024: five accounts pushing false information amassed over 430 million views. Of the 1,060 posts shared by these accounts during the height of the riots, only one received a community note.

The Board draws a clear conclusion: Community Notes should not be introduced in countries experiencing crises or protracted conflict. In such situations, thresholds for incitement to violence are lower, and notes targeting specific groups "can more easily result in offline harm." The concern is well founded: Meta's failures to moderate hateful content contributed to violence against minority groups in Myanmar and Ethiopia, and in 2018 Facebook apologized for its role in "offline violence" in Myanmar.

Particularly problematic: Meta, according to the Board, "has not developed provisions regarding the use of the product in crisis situations, including adapting, modifying, or suspending the feature."

Minorities could be systematically disadvantaged

A structural weakness lies in the algorithm itself: as the Board notes, it models societal polarization along a single axis, and "Meta has not provided any information that suggests its program will be substantively different" from X's. In countries where social division cannot be reduced to a single axis, for instance because political, ethnic, religious, and linguistic conflicts overlap, this can lead to the systematic marginalization of minorities.

The Board describes a concrete scenario: when dominant groups share a mutual prejudice against a minority group, that prejudice can serve as the "bridge" between otherwise disagreeing majority groups. Harmful notes targeting minorities could then reach the consensus threshold and be published. A consortium of South Asian NGOs presented the Board with evidence of precisely such dynamics in X's Community Notes in India, where political divisions reflect complex and overlapping affiliations spanning ethnicity, religion, language, and caste.
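The scenario the Board describes can be seen even in a rule-based caricature of bridging. In the sketch below (illustrative only; the real algorithm uses the matrix-factorization approach, and the camp names and majority rule are assumptions), the publication test is simply "a majority of raters in every camp found the note helpful." Nothing in that test distinguishes agreement that reflects accuracy from agreement that reflects a shared prejudice against a group outside both camps.

```python
# Caricature of a bridging check: publish if every normally-opposed camp
# rates the note helpful. Names and the majority rule are illustrative.

def bridging_publish(ratings_by_camp):
    """ratings_by_camp: {camp: [1/0 helpful votes]}.
    Publish if a majority in every camp rated the note helpful."""
    return all(sum(votes) / len(votes) > 0.5
               for votes in ratings_by_camp.values())

# An accurate note both camps accept...
accurate_note = {"camp_a": [1, 1, 0], "camp_b": [1, 1, 1]}
# ...and a note targeting a minority that both camps happen to dislike.
prejudiced_note = {"camp_a": [1, 1, 1], "camp_b": [1, 0, 1]}

# The check passes both: shared prejudice "bridges" just as well as truth.
print(bridging_publish(accurate_note), bridging_publish(prejudiced_note))
```

A one-dimensional viewpoint model has the same blind spot in subtler form: when the dominant camps agree on a prejudice, that agreement looks like cross-camp consensus rather than a second axis of division.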

There is also a linguistic dimension: the system currently operates in only six languages (English, Spanish, Chinese, Vietnamese, French, and Portuguese). Research shows that notes in non-English languages on X are rated and published far less frequently. In countries with repressive human rights records, the Board also sees risks to the safety of contributors should their anonymity be compromised: risks to the right to privacy (Article 17, ICCPR), security of the person (Article 9, ICCPR), and even the right to life (Article 6, ICCPR).

The Board recommends a staggered rollout with strict exclusion criteria

The Oversight Board sets out a series of concrete recommendations for the planned international expansion. Countries with repressive human rights records and weak civil societies should be omitted from the rollout until Meta can demonstrate robust contributor privacy protections, "with evidence of red-teaming under adversarial conditions, a clear policy on handling requests for community notes data from law enforcement agencies and the presence of risk mitigation measures." The same applies to countries experiencing active crises or conflicts and countries with a history of coordinated disinformation networks.

For elections, the Board recommends particular caution: where Meta determines through product testing, risk assessment, and human rights due diligence that its safeguards are insufficient, Community Notes "should not be introduced in advance of or during major elections." For countries with multidimensional social divisions, the Board recommends "extreme caution" without categorically excluding them. Countries facing persistent obstacles to internet access should also be omitted.

Meta is to provide the Board with the criteria or risk matrix it develops to guide expansion every six months during the initial rollout. The Board also calls for "substantial transparency, reporting and researcher access to data on Meta's community notes performance." However, Meta is not legally required to comply with the Board's recommendations.

Fact-checking and Community Notes are not mutually exclusive

The Board explicitly emphasizes that community notes and professional fact-checking "should not be seen as mutually exclusive tools." Research shows that community notes on X cite "fact-checking sources up to five times more than previously reported." Fact-checking organizations are the "third most used reference globally" in published notes, according to another study. Reducing support for fact-checkers would therefore also undermine the quality of Community Notes itself, as contributors would have fewer reliable sources to cite.
