Getting Started on Toolforge for Wikipedia Bot Development

Want to build a bot that helps edit Wikipedia? You’re not alone. Thousands of volunteers run bots that fix typos, update templates, patrol vandalism, and even add data from structured sources. But where do you start? Toolforge is the free, official platform that powers most Wikipedia bots - and it’s easier to use than you think.

What is Toolforge?

Toolforge is a hosting service run by the Wikimedia Foundation. It lets volunteers run tools, including automated bots, that interact with Wikipedia and other Wikimedia projects. Think of it like a cloud server, but built just for Wikimedia contributors. You don’t need to pay for it. You don’t need to manage hardware. You just write your code, and Toolforge runs it 24/7.

Tools on Toolforge can do things like:

  • Automatically fix broken links in articles
  • Update infoboxes using data from Wikidata
  • Flag potential copyright violations
  • Revert vandalism within seconds of it happening
  • Generate statistics or reports for editors

Over 1,200 active tools are hosted on Toolforge right now. Many of them are used daily on English Wikipedia alone. If you’ve ever seen a bot edit that made your life easier, it was probably running on Toolforge.

Before You Start: What You Need

You don’t need to be a professional developer to use Toolforge. But you do need a few things:

  • A Wikimedia account - this is your login for Wikipedia, Wikidata, and Toolforge
  • Basic knowledge of Python - most bots are written in Python using the Pywikibot library
  • A clear idea of what your bot will do - bots must be approved before they can run on live wikis

Pywikibot is the most popular tool for building bots on Wikimedia. It’s a Python library that handles all the messy parts of talking to Wikipedia’s API - authentication, editing, logging, and error handling. You write the logic; Pywikibot does the heavy lifting.
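To get a feel for what Pywikibot hides, here is a rough sketch of building a single read request against the MediaWiki Action API by hand. The function name build_content_query is made up for illustration, and this covers only the easy part: logging in, fetching edit tokens, retrying on errors, and throttling would all have to be added on top.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_content_query(title: str) -> str:
    """Return the Action API URL that asks for the current wikitext of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
    }
    # urlencode escapes spaces and special characters in the title.
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_content_query("Climate change")
print(url)
```

With Pywikibot, the equivalent read is just page.text, which is why most bot authors never build these URLs themselves.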

Don’t know Python? Start with a free 2-hour course on Codecademy or Khan Academy. You don’t need to be an expert - just understand variables, loops, and functions.

Step 1: Create Your Tool

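Go to toolsadmin.wikimedia.org and log in. Toolforge uses a Wikimedia developer account, which you can create there and link to your wiki account, and you need to request Toolforge membership before you can create tools. Once that is approved, click "Create a new tool". Pick a unique name - something like mybot or linkfixer. Avoid generic names like "bot1" - they’re likely already taken.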

Once created, you’ll get a shell account. You can log in via SSH:

ssh yourusername@login.toolforge.org

This gives you access to a Linux server where you’ll install your bot’s code. You’ll use tools like git to manage your code and pip3 to install Python packages.

Step 2: Set Up Pywikibot

Inside your tool’s shell, run these commands:

  1. Install Pywikibot: pip3 install --user pywikibot
  2. Generate a config file: pwb generate_user_files
  3. When prompted, enter your bot account’s username. For credentials, use a bot password or an OAuth token rather than your main account password

This sets up your bot’s identity. The generated files hold your bot’s login details, so treat them as secrets. Never share them. Never commit them to public GitHub repos.
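One way to double-check that a credentials file is not readable by other users is a small permission check like the sketch below. The helper names is_private and lock_down are hypothetical, and the check assumes a Unix-style file system such as the one on the Toolforge shell. Pywikibot typically writes user-config.py, plus user-password.py if you use a bot password, so those are the files you would point it at.

```python
import os
import stat
from pathlib import Path


def is_private(path: Path) -> bool:
    """Return True if no permission bits are set for group or others."""
    mode = path.stat().st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def lock_down(path: Path) -> None:
    """Restrict the file to owner read/write, the same as chmod 600."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

A typical use would be to call is_private(Path("user-password.py")) at startup and refuse to run, or call lock_down, if it returns False.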

Step 3: Write Your First Bot

Let’s build a simple bot that adds a category to articles that mention "climate change" but don’t have the [[Category:Climate change]] tag.

Create a file called add_climate_category.py and paste this:

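import pywikibot

# Connect to English Wikipedia using the account from your config file.
site = pywikibot.Site('en', 'wikipedia')
page = pywikibot.Page(site, 'Example article')

# Compare against lowercased text so "Climate change" at the start of a
# sentence also counts as a mention.
text = page.text
if 'climate change' in text.lower() and 'Category:Climate change' not in text:
    page.text = text + "\n[[Category:Climate change]]"
    page.save(summary='Bot: Adding climate change category', botflag=True)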

Run it with: python3 add_climate_category.py
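It often helps to keep the "should this page change?" decision in a plain function that never touches the wiki, so you can test it on your own machine. The sketch below is one way to do that. The name add_category and its parameters are illustrative, and the pattern handles only the common ways a category link is written (extra spaces, a sort key).

```python
import re


def add_category(text: str, category: str, keyword: str) -> str:
    """Append [[Category:<category>]] when the keyword appears and the
    category is not already present; otherwise return the text unchanged."""
    has_keyword = keyword.lower() in text.lower()
    # Accept optional spaces and a sort key, e.g. [[ Category:Foo|Bar ]].
    pattern = (r"\[\[\s*Category\s*:\s*" + re.escape(category)
               + r"\s*(\|[^\]]*)?\]\]")
    has_category = re.search(pattern, text, flags=re.IGNORECASE) is not None
    if has_keyword and not has_category:
        return text.rstrip("\n") + f"\n[[Category:{category}]]"
    return text
```

In the bot itself you would compute new_text = add_category(page.text, 'Climate change', 'climate change') and only assign page.text and call page.save() when new_text differs from page.text.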

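This bot checks one page. Real bots scan hundreds or thousands. To process every page that matches a search, use a page generator:

from pywikibot import pagegenerators

# Search the article namespace (0) for pages that mention the phrase.
# Reuses the site object from the previous example.
gen = pagegenerators.SearchPageGenerator('climate change', namespaces=[0], site=site)
for page in gen:
    if 'Category:Climate change' not in page.text:
        page.text += "\n[[Category:Climate change]]"
        page.save(summary='Bot: Adding climate change category', botflag=True)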

That’s it. Your bot now scans every article with "climate change" and adds the category if missing.
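Before letting a loop like this save anything, many operators do a dry run that only lists what would change, with a cap on the batch size. Here is a minimal sketch of that idea. FakePage is a stand-in invented for this example so the code runs without a wiki; real Pywikibot pages expose the title through page.title() rather than an attribute.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FakePage:
    """Stand-in for a wiki page: just a title and its wikitext."""
    title: str
    text: str


def plan_edits(pages: Iterable[FakePage], tag: str, max_edits: int) -> List[str]:
    """Return the titles that would be edited, stopping after max_edits."""
    planned: List[str] = []
    for page in pages:
        if len(planned) >= max_edits:
            break
        if tag not in page.text:
            planned.append(page.title)
    return planned


pages = [
    FakePage("A", "Mentions climate change."),
    FakePage("B", "Tagged. [[Category:Climate change]]"),
    FakePage("C", "Also mentions climate change."),
    FakePage("D", "One more mention of climate change."),
]
print(plan_edits(pages, "[[Category:Climate change]]", max_edits=2))
```

Reviewing the planned list by hand for the first few runs is a cheap way to catch a bad search query before it turns into hundreds of edits.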

Step 4: Test and Get Approval

Never run a bot on live Wikipedia without testing. Use the test.wikipedia.org wiki. It’s a sandbox. You can break things there without consequences.
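A simple guard against editing the live wiki by accident is to make the test wiki the default and require an explicit opt-in for production. The sketch below assumes an environment variable named BOT_LIVE, which is a made-up convention for this example rather than anything Toolforge or Pywikibot defines.

```python
import os
from typing import Tuple


def target_site() -> Tuple[str, str]:
    """Return the (code, family) pair for the wiki the bot should edit."""
    # Only an explicit BOT_LIVE=1 switches to the live English Wikipedia.
    if os.environ.get("BOT_LIVE") == "1":
        return ("en", "wikipedia")
    return ("test", "wikipedia")
```

In your bot you would then write site = pywikibot.Site(*target_site()), so a forgotten setting sends edits to test.wikipedia.org instead of the real thing.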

Once your bot works in the sandbox, you need approval. Go to the Wikipedia bot approval page:

https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval

Fill out the form. Explain:

  • What your bot does
  • How often it runs
  • How you’ll monitor it
  • Why it helps Wikipedia

Community members will review your request. Some bots get approved in a day. Others take weeks. Be patient. Answer questions. Show you’ve tested it well.

Step 5: Run Your Bot on Toolforge

Once approved, you can set up automated runs. Toolforge lets you schedule jobs with cron or use the webservice command.

To run your bot every 6 hours:

Edit your crontab: crontab -e

Add this line:

0 */6 * * * cd /data/project/yourtool/ && python3 add_climate_category.py
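If the schedule syntax looks cryptic, this small sketch expands the first two fields of that line (minute "0", hour "*/6") into the actual run times. The helper expand_step_field is written for this example and understands only three simple forms: "*", a single number, and "*/N". Real cron accepts more, such as lists and ranges.

```python
from typing import List


def expand_step_field(field: str, low: int, high: int) -> List[int]:
    """Expand a cron field of the form '*', 'N', or '*/N' into its values."""
    if field == "*":
        return list(range(low, high + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(low, high + 1, step))
    return [int(field)]


minutes = expand_step_field("0", 0, 59)
hours = expand_step_field("*/6", 0, 23)
run_times = [f"{h:02d}:{m:02d}" for h in hours for m in minutes]
print(run_times)  # ['00:00', '06:00', '12:00', '18:00']
```

So the job starts four times a day, on the hour, every six hours.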

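If your tool also needs a web interface, such as a status page, start the built-in web service:

webservice --backend=kubernetes python3.11 start

This starts a web server for your tool; it does not run your bot script. For a bot that needs to stay running all the time rather than start on a schedule, the Toolforge jobs framework can run it as a continuous job.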

Common Mistakes to Avoid

  • Running bots without approval - this gets you blocked
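  • Editing too fast - English Wikipedia’s bot policy suggests roughly one edit every 10 seconds for non-urgent tasks, and no faster than about one every 5 seconds for urgent ones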
  • Not logging edits - always use a clear edit summary
  • Ignoring bot policy - read Wikipedia:Bot policy before you start
  • Using passwords instead of OAuth - passwords can be stolen. OAuth tokens are safer

Many new bot creators get blocked because they run bots without understanding these rules. Take the time to learn them. It saves months of frustration.
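To make the "editing too fast" rule concrete, here is a minimal sketch of a throttle that enforces a minimum gap between edits. The class name EditThrottle is invented for this example. Pywikibot already ships its own throttle (controlled by the put_throttle setting in your config), so you would normally rely on that; the sketch is mainly useful for seeing how the idea works or for scripts that don’t use Pywikibot.

```python
import time
from typing import Callable, Optional


class EditThrottle:
    """Enforce a minimum delay, in seconds, between consecutive edits."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # None until the first edit

    def wait(self) -> float:
        """Block until the next edit is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        # Record when this edit is allowed to happen.
        self._last = now + delay
        return delay
```

You would create one throttle, for example EditThrottle(5.0), and call its wait() method right before every save. The clock and sleep parameters exist only so the class can be tested without real waiting.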

Next Steps

Once your first bot is running, think bigger:

  • Use Wikidata to auto-populate infoboxes
  • Build a bot that checks citations for dead links
  • Create a tool that visualizes edit patterns for editors

The Wikimedia community has dozens of open bot projects. Join the wikitech-l mailing list or the Wikimedia Discord to find collaborators.

There’s no limit to what you can build. A single bot can fix thousands of errors. And it all starts with Toolforge - free, open, and built by volunteers like you.

Do I need to know how to code to use Toolforge?

You don’t need to be a professional programmer, but you do need basic Python skills. Most bots are built using Pywikibot, which simplifies editing Wikipedia. If you understand loops, conditionals, and functions, you can build a bot. There are templates and tutorials available to help you get started without writing everything from scratch.

Can I run a bot on any Wikipedia language?

Yes. Toolforge supports all Wikimedia projects, including Wikipedia in over 300 languages. You just need to specify the correct site in your code - for example, pywikibot.Site('es', 'wikipedia') for Spanish Wikipedia. Each language community has its own bot approval process, so check their policies before deploying.

How much traffic can Toolforge handle?

Toolforge has limits to prevent abuse. Each tool gets 1 CPU core, 1 GB of RAM, and 10 GB of storage. You’re allowed up to 10 edits per minute. Most bots don’t come close to these limits. If your bot needs more power - like processing millions of pages - you’ll need to optimize your code or request a resource increase from the Wikimedia Foundation.
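A quick back-of-the-envelope calculation shows why very large jobs need planning. The sketch below takes the 10 edits per minute figure from the answer above at face value and assumes the bot runs around the clock; the function name is just for illustration.

```python
def estimate_runtime_days(edit_count: int, edits_per_minute: float) -> float:
    """Return how many days a batch of edits takes at a steady rate."""
    minutes = edit_count / edits_per_minute
    return minutes / (60 * 24)


days = estimate_runtime_days(1_000_000, 10)
print(f"{days:.1f} days")  # 69.4 days
```

At that rate a million edits would take over two months, which is usually a sign to narrow the task down rather than to ask for more hardware.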

What happens if my bot makes a mistake?

Mistakes happen. If your bot edits incorrectly, revert the changes quickly and notify the community. You’ll likely get a warning, but not a permanent ban - if you respond responsibly. The key is to monitor your bot closely at first. Use the edit history, set up logging, and test thoroughly before going live. Most experienced bot operators keep a "kill switch" - a manual way to stop the bot instantly.
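A kill switch can be as simple as a file that the bot checks for before each unit of work. The sketch below shows that pattern under the assumption that you can create the file from the tool’s shell, for example with touch. The function name and the idea of passing tasks as callables are choices made for this example; some operators check an on-wiki page instead so that other editors can stop the bot too.

```python
from pathlib import Path
from typing import Callable, Iterable


def run_until_stopped(tasks: Iterable[Callable[[], None]], stop_file: Path) -> int:
    """Run tasks in order, checking for the stop file before each one.

    Returns the number of tasks that were completed.
    """
    done = 0
    for task in tasks:
        if stop_file.exists():
            break
        task()
        done += 1
    return done
```

Each task would be one page edit, so creating the stop file halts the bot before its next save rather than in the middle of one.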

Is Toolforge safe to use?

Yes. Toolforge is managed by the Wikimedia Foundation, and it’s been running for over a decade. Keeping your credentials safe is still your job: keep credential files private to your tool account, never store your main account password in plain text, and use OAuth tokens for authentication.