Wikipedia Content Assessment Criteria Changes: What Editors Need to Know in 2026

For years, the little colored boxes at the top of WikiProject pages have served as a reliable compass for readers. (WikiProjects are collaborative groups within Wikipedia that focus on specific topics or themes to improve article quality and coverage.) The boxes told you if an article was a stub, a good start, or a featured masterpiece. But in 2026, that compass is spinning. The Wikimedia community has finalized significant updates to how content is assessed, moving away from static labels toward dynamic, data-driven quality signals.

If you are an editor, a researcher, or just someone who trusts Wikipedia, these changes matter. They affect how articles are prioritized for improvement, how bots interact with text, and ultimately, what you see when you click a link. The old system relied heavily on human volunteers manually tagging pages. The new framework integrates automated analysis with community consensus, aiming to reduce bias and increase accuracy.

The Shift from Static Tags to Dynamic Signals

Traditionally, Wikipedia article grades were static. These grades are a classification system used by WikiProjects to rate the quality of articles based on completeness, accuracy, neutrality, and prose style. An article rated "Start" could sit there for five years without change, even if it had been significantly expanded. The new criteria introduce a concept called "dynamic decay." If an article hasn't been reviewed against current standards in two years, its grade automatically drops one tier until a human editor re-evaluates it. This forces continuous maintenance rather than one-time perfection.
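To make the decay rule concrete, here is a minimal sketch in Python. It is an illustration only, not actual Wikimedia code: the function name `apply_decay`, the tier list, and the choice to approximate "two years" as 730 days are all assumptions, and the sketch applies a single one-tier drop because the rule as described does not say whether further drops follow.

```python
from datetime import date, timedelta

# Tier order from lowest to highest. This ordering is an assumption based on
# the grades named in this article, not an official list.
TIERS = ["Stub", "Start", "C", "B", "A"]

# "Two years" approximated as 730 days (assumption).
REVIEW_WINDOW = timedelta(days=2 * 365)


def apply_decay(grade: str, last_reviewed: date, today: date) -> str:
    """Return the grade after at most one decay step.

    The grade is unchanged if the last review falls inside the window.
    Otherwise it drops one tier, never below the lowest tier.
    """
    if today - last_reviewed <= REVIEW_WINDOW:
        return grade
    index = TIERS.index(grade)
    return TIERS[max(index - 1, 0)]


# A "B" article last reviewed more than two years ago drops to "C".
print(apply_decay("B", date(2023, 1, 10), date(2026, 3, 1)))  # C
```

A human re-evaluation would then reset `last_reviewed` and assign whatever grade the reviewer considers appropriate.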

This shift addresses a long-standing pain point: the backlog of unreviewed improvements. In the past, editors would polish an article, but without a formal reassessment, it remained labeled as lower quality. Now, the system assumes that lack of recent review equals potential staleness. It’s a nudge, not a punishment, designed to keep the encyclopedia fresh.

Automated Pre-Assessment Tools

A major component of the 2026 update is the integration of ORES (Objective Revision Evaluation Service) directly into the assessment workflow. ORES is a machine learning tool developed by the Wikimedia Foundation to predict the quality of Wikipedia edits and the likelihood that they are vandalism. Previously, ORES was mostly used for detecting vandalism. Now, it provides preliminary quality scores before a human editor even opens the page. When you visit a WikiProject dashboard, you’ll see a predicted grade alongside the official human-assigned grade. If they differ significantly, the article is flagged for priority review.
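The flagging step described above can be sketched as a simple comparison between two grades. This is a hypothetical illustration: it does not call ORES or any real dashboard API, the function name `needs_priority_review` is invented, and the threshold of two tiers is an assumption, since "differ significantly" is not defined here.

```python
# Tier order from lowest to highest (assumed from the grades named in this article).
TIERS = ["Stub", "Start", "C", "B", "A"]


def needs_priority_review(human_grade: str, predicted_grade: str, min_gap: int = 2) -> bool:
    """Flag an article when the human and predicted grades are far apart.

    `min_gap` is the number of tiers that counts as a significant difference.
    The default of 2 is an assumption made for this sketch.
    """
    gap = abs(TIERS.index(human_grade) - TIERS.index(predicted_grade))
    return gap >= min_gap


print(needs_priority_review("Start", "B"))  # True: two tiers apart
print(needs_priority_review("C", "B"))      # False: only one tier apart
```

The comparison is symmetric on purpose, so an article the model rates well below its official grade is flagged just like one it rates well above.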

This doesn’t mean AI replaces editors. It means editors spend less time checking basic criteria like citation density or lead section length, and more time evaluating nuance, tone, and structural logic. For large WikiProjects like WikiProject Medicine, a specialized group of Wikipedia editors focused on improving medical and health-related articles to ensure high accuracy and adherence to medical guidelines, this automation has reduced assessment backlogs by nearly 40% in pilot programs.

New Quality Tiers and Granularity

The classic A, B, C, Start, Stub hierarchy remains, but new intermediate tiers have been introduced to better reflect modern editing realities. The most notable additions are the "C+" and "B-" designations. These acknowledge that an article might be structurally sound but lack depth in certain areas, or vice versa. This granularity helps editors target specific weaknesses. Instead of a vague "improve this," a "B-" rating might indicate strong prose but weak sourcing in the history section.

Additionally, the definition of "Featured Article" has tightened. It now requires explicit demonstration of "encyclopedic scope," meaning the article must cover the topic comprehensively enough to stand alone as a reference work, not just be well-written. This has led to a slight decrease in the number of Featured Articles but a noticeable increase in their reliability for academic use.

Comparison of Old vs. New Assessment Criteria
Feature           | Pre-2026 System              | 2026 Updated System
------------------|------------------------------|---------------------------------------
Review Frequency  | Manual, ad-hoc               | Dynamic decay triggers auto-review
AI Integration    | Vandalism detection only     | Predictive quality scoring via ORES
Granularity       | Broad tiers (A, B, C, Start) | Intermediate tiers (A-, B+, C+, etc.)
Scope Requirement | Implicit                     | Explicit "encyclopedic scope" mandate

Impact on WikiProject Governance

WikiProject governance, the decentralized decision-making processes and leadership structures that manage specific topic areas within Wikipedia, is undergoing a subtle but important shift. Project leaders are no longer just arbiters of taste; they are managers of data. The new criteria require projects to publish quarterly reports on assessment accuracy and reviewer diversity. This transparency aims to combat systemic bias, where certain topics (like Western history) received higher grades due to larger pools of experienced editors.

Projects focusing on underrepresented regions or niche technical fields now receive additional support tools. For example, WikiProject Africa, a collaborative effort to expand and improve Wikipedia's coverage of African countries, cultures, and histories, has seen a surge in activity because the new system highlights "orphaned" high-potential articles that need expert attention. The algorithm identifies articles with high traffic but low quality, directing volunteer energy where it’s needed most.
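One way to picture the "high traffic but low quality" ranking is a score that grows with page views and with distance from the top tier. The sketch below is only a guess at the idea, not the actual algorithm: the `Article` class, the `attention_score` formula, and the sample view counts are all made up for illustration.

```python
from dataclasses import dataclass

# Tier order from lowest to highest (assumed from the grades named in this article).
TIERS = ["Stub", "Start", "C", "B", "A"]


@dataclass
class Article:
    title: str
    monthly_views: int
    grade: str


def attention_score(article: Article) -> int:
    """Score an article by views multiplied by its distance from the top tier.

    An "A" article scores 0. A heavily read "Stub" scores highest.
    This formula is a hypothetical heuristic, not a documented one.
    """
    tiers_below_top = len(TIERS) - 1 - TIERS.index(article.grade)
    return article.monthly_views * tiers_below_top


# Illustrative sample data only; the titles and numbers are invented.
articles = [
    Article("High traffic, B", 80_000, "B"),
    Article("Low traffic, Stub", 2_000, "Stub"),
    Article("High traffic, Start", 50_000, "Start"),
]

ranked = sorted(articles, key=attention_score, reverse=True)
print([a.title for a in ranked])
# ['High traffic, Start', 'High traffic, B', 'Low traffic, Stub']
```

Multiplying the two factors means neither traffic alone nor a low grade alone pushes an article to the top of the queue; it takes both.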

Challenges and Community Pushback

No change on Wikipedia happens without debate. Some veteran editors argue that dynamic decay creates unnecessary churn. They worry that editors will feel pressured to constantly tweak articles just to maintain a grade, rather than focusing on substantive research. There’s also concern about over-reliance on ORES. While accurate, the model can struggle with nuanced topics like philosophy or contemporary politics, where "neutrality" is complex and context-dependent.

To address this, the community established a "Human Override" mechanism. If three experienced editors agree that an article’s automated score is misleading, they can lock the grade for six months while they conduct a deep manual review. This balance between efficiency and human judgment is the core of the new policy’s success.
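The override rule above reduces to a threshold check plus a lock expiry date. The following sketch assumes that "six months" can be approximated as 182 days and that endorsements are counted per distinct editor; the function name `override_lock_until` is hypothetical, and how an editor qualifies as "experienced" is left out entirely.

```python
from datetime import date, timedelta
from typing import Optional

REQUIRED_ENDORSEMENTS = 3            # "three experienced editors"
LOCK_DURATION = timedelta(days=182)  # "six months", approximated (assumption)


def override_lock_until(endorsing_editors: set[str], today: date) -> Optional[date]:
    """Return the date the grade lock expires, or None if there is no lock.

    Using a set means the same editor endorsing twice counts only once.
    """
    if len(endorsing_editors) < REQUIRED_ENDORSEMENTS:
        return None
    return today + LOCK_DURATION


# Two endorsements are not enough; three distinct editors trigger the lock.
print(override_lock_until({"EditorA", "EditorB"}, date(2026, 1, 1)))
print(override_lock_until({"EditorA", "EditorB", "EditorC"}, date(2026, 1, 1)))
```

The first call prints `None`, and the second prints a date 182 days after the given start date.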

What This Means for Readers

You won’t see dramatic visual changes on your screen. The color-coded boxes remain. But behind the scenes, the trustworthiness of those boxes has increased. When you see a "B" grade today, it’s more likely to reflect a recent, rigorous evaluation than a tag slapped on in 2018. For students and professionals citing Wikipedia, this means fewer surprises regarding outdated information. The system is designed to make the encyclopedia self-correcting, ensuring that high-quality content stays visible and low-quality content gets flagged for help.

Next Steps for Editors

If you contribute to Wikipedia, here’s how to adapt:

  • Check your WikiProject dashboards for new "Priority Review" flags.
  • Trust but verify ORES predictions; use them as starting points, not final verdicts.
  • Participate in quarterly reviews to help calibrate your project’s standards.
  • Focus on "encyclopedic scope" when aiming for higher grades; don’t just polish prose.

Will my old Wikipedia article grades disappear?

No, existing grades remain valid until they undergo dynamic decay or manual review. However, if an article hasn't been checked in two years, its grade may automatically drop one tier to signal that it needs a fresh look.

Does AI now decide Wikipedia article quality?

Not entirely. AI tools like ORES provide predictive scores to assist editors, but final grades still require human consensus. The AI acts as a triage tool to highlight articles needing attention, not as the final judge.

Why were intermediate grades like B+ added?

Intermediate grades offer more precise feedback. They help editors understand exactly where an article stands, whether it’s strong in structure but weak in sources or vice versa, making improvement efforts more targeted and efficient.

How does dynamic decay work?

If an article isn't reviewed against current standards within two years, its quality grade automatically decreases by one level. This encourages regular maintenance and ensures that grades reflect the article's current state, not its past glory.

Can I override an AI-generated quality score?

Yes. If multiple experienced editors believe an AI prediction is inaccurate, they can trigger a Human Override. This locks the grade temporarily while a thorough manual review is conducted to ensure fairness and accuracy.