Meta claims hate speech is low, but so is the bar

Meta claims that its Community Standards Enforcement Report for the first quarter of 2025 shows significant improvement in moderation accuracy while keeping harmful content at bay. This reported stability follows major policy and enforcement changes announced in January 2025. However, since these changes affect how violations such as hate speech are assessed and reported, the figures in the Q1 Transparency Report likely do not reflect what is actually happening on Meta’s platforms.

Back in January, Meta introduced a shift in its enforcement approach. It stated that it took a series of steps “to allow for more speech while working to make fewer mistakes.” Meta said it would keep removing “the most serious and illegal” content on its own, but for less harmful content, it would wait for user reports before taking action.

This policy change narrowed the scope of proactive moderation, where violating content is removed before a user even reports it. Under the previous policy, Meta’s systems immediately identified and acted on a wider range of harmful content, including hate speech around gender issues, especially speech targeting LGBTQIA+ people. 

Because user reporting depends on individual judgment and inconsistent use of reporting tools, it is generally less reliable than automated enforcement. The January update also introduced new definitions and thresholds for what qualifies as a violation, making it difficult to compare the Q1 2025 data with earlier reports. As a result, the total number of recorded violations will be lower, not necessarily because there is less harmful content, but because less of it now counts as a violation.

This policy shift has two important consequences for the transparency report. First, the reported prevalence of harmful content (i.e., any content that violates Meta’s Community Standards) likely underrepresents the actual number of lower-severity violations. Second, even minor increases in prevalence may in fact signal a more substantial rise in harmful content that goes undetected under the relaxed enforcement standards.

For example, in the Q1 2025 report, Meta notes:

“There was a small increase in the prevalence of bullying and harassment content from 0.06%-0.07% to 0.07%-0.08% on Facebook due to a spike in sharing of violating content in March. There was also a small increase in the prevalence of violent and graphic content from 0.06%-0.07% to about 0.09%.”

While these increases are presented as minor, they must be interpreted cautiously. If the enforcement system now filters fewer incidents for action, less content is detected as violating. Even a slight rise may therefore mask a much larger trend in harmful content on the platform.

Meta also attributes a significant decline in enforcement mistakes to the January policy update:

“We saw a roughly 50% reduction in enforcement mistakes on our platforms in the United States from Q4 2024 to Q1 2025 – during that same time period, the low prevalence of violating content largely remained unchanged for most problem areas.”

This reduction is framed as a moderation success. More accurately, though, it is a byproduct of reduced proactive enforcement. By taking down less lower-severity content, Meta lowers the likelihood of false positives (content flagged as harmful even when it is not). In other words, the numbers may look better on paper, but this is partly because Meta is simply “looking” for fewer violations, not necessarily because the platform is safer.

In fact, Meta states:

“We’re publishing our first quarter reports for 2025, including the Community Standards Enforcement Report, where following the changes announced in January we’ve cut enforcement mistakes in the U.S. in half, while during that same time period the low prevalence of violating content on platform remained largely unchanged for most problem areas.”

This suggests that the policy changes didn’t increase the amount of harmful content on the platform, and may have even reduced wrongful takedowns. However, such an interpretation is misleading, as the prevalence of violating content is now measured under the more relaxed standards introduced in January 2025.

These concerns are also echoed by civil society organizations monitoring the implications of Meta’s enforcement changes. GLAAD’s 2025 report, “Make Meta Safe,” highlights a noticeable increase in harmful content targeting LGBTQIA+ people and other marginalized groups following Meta’s January rollbacks. It warns that the company’s reduced proactive enforcement has led to greater visibility of hate speech and harassment online, disproportionately affecting vulnerable populations.

The Oversight Board’s April 2025 statement adds important context to these developments. The Board expressed concern that “Meta’s shift toward a higher threshold for intervention risks leaving harmful but non-illegal content online longer, particularly in contexts where user reporting is low or unreliable.” It also called for “greater clarity in Meta’s transparency reporting to ensure changes in enforcement priorities are fully disclosed and comparable over time.”

The Board issued 17 recommendations aimed at strengthening Meta’s accountability and transparency. Among these, it called on Meta to:

  • Conduct a human rights impact assessment (HRIA) of the January 7 hateful conduct policy updates, focusing on potential adverse effects on Global Majority countries, LGBTQIA+ people (including minors), and immigrants. Meta should publicly report on its findings soon and provide progress updates every six months.
  • Clarify policy language, particularly regarding references to hateful ideologies in the Dangerous Organizations and Individuals policy.
  • Improve enforcement of Bullying and Harassment policies, especially in cases where users are required to self-report violations.
  • Continually assess the effectiveness of Community Notes and other alternative measures to content removal, particularly during events where rapid misinformation spread poses public safety risks.
  • Enhance reviewer guidance on detecting incitement to violence in visual content.

Right now, Meta’s transparency reports do not really tell us whether the reported changes in harmful content reflect shifts in user behavior or simply a narrower scope of enforcement. This lack of transparency undermines the reports’ usefulness in studying trends in harmful content on its platforms.

While Meta presents the Q1 2025 data as evidence of improved moderation accuracy and a stable prevalence of harmful content, these numbers reflect the new enforcement policy. The transparency report’s metrics are not directly comparable to previous quarters, and the reported “small increases” in harmful content may actually downplay how much harmful content users are really seeing. Meta should disclose how policy changes affect these numbers so that its moderation practices can be evaluated more accurately.