Building a Telegram bot to babysit 24,000 developers

The PythonID Telegram group has around 24,000 members: Indonesian developers chatting about Python, Django, FastAPI, job openings, and, occasionally, cryptocurrency scams disguised as job offers.

Moderation was manual. Admins would spot a spammer, ban them, delete the messages, and then do it again twenty minutes later. Bots existed but they were either too dumb (kick anyone without a profile photo) or too complex (require a PhD in YAML to configure). So I wrote my own.

What PythonID-bot actually does

The bot watches every message in the group. When a new user joins, it checks two things: do they have a profile photo, and do they have a username set. If not, they get a warning posted to a dedicated topic thread. If they ignore the warning for three hours, they get muted.
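A minimal sketch of that decision, not the bot’s actual code. The Profile class, the function names, and the action strings are all hypothetical; only the two checks and the three-hour window come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Three hours, as described in the post.
WARNING_THRESHOLD = timedelta(minutes=180)


@dataclass
class Profile:
    """Hypothetical stand-in for what the bot learns about a user."""
    has_photo: bool
    username: Optional[str]


def missing_items(profile: Profile) -> list[str]:
    """List which of the two required profile items are missing."""
    missing = []
    if not profile.has_photo:
        missing.append("profile photo")
    if not profile.username:
        missing.append("username")
    return missing


def next_action(profile: Profile, warned_at: Optional[datetime], now: datetime) -> str:
    """Decide what to do with a user: 'ok', 'warn', 'wait', or 'mute'."""
    if not missing_items(profile):
        return "ok"
    if warned_at is None:
        return "warn"  # first sighting: post a warning to the topic thread
    if now - warned_at >= WARNING_THRESHOLD:
        return "mute"  # warning ignored for three hours
    return "wait"
```

Keeping the decision in a pure function like this makes it easy to test without touching the Telegram API.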

That alone cut the spam by maybe 60%. Turns out most spam accounts don’t bother setting a profile picture.

But 60% is not enough when you’re getting hit with crypto and gambling spam daily.

Captcha

New members can optionally be required to solve a button-based captcha within 60 seconds. It’s not a hard captcha. Press the right button. But it stops the bots that just join, dump a message, and leave.

The captcha system also survives bot restarts. If the bot goes down while someone has a pending captcha, it recovers all pending challenges on startup and either re-schedules the timeout or expires them if time already ran out. I didn’t think about this until the bot crashed during a spam wave and five people were stuck in limbo.
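The recovery step boils down to comparing each challenge’s issue time against the clock. Here is a rough sketch of that idea, assuming pending challenges are stored as a mapping of user ID to issue time; the function name and storage shape are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# 60 seconds, as described in the post.
CAPTCHA_TIMEOUT = timedelta(seconds=60)


def recover_pending(pending: dict[int, datetime], now: datetime):
    """Split pending captchas into ones to re-schedule and ones to expire.

    Returns (reschedule, expired): reschedule maps user_id to the seconds
    still left on the clock, expired lists users whose time already ran out.
    """
    reschedule: dict[int, float] = {}
    expired: list[int] = []
    for user_id, issued_at in pending.items():
        remaining = (issued_at + CAPTCHA_TIMEOUT - now).total_seconds()
        if remaining > 0:
            reschedule[user_id] = remaining
        else:
            expired.append(user_id)
    return reschedule, expired
```

On startup the bot would then schedule a fresh timeout job for each entry in the first result and run the expiry path for each entry in the second.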

Probation

Even after passing captcha, new users can’t send links, forwarded messages, or external quotes for three days. Some domains are whitelisted (docs.python.org, github.com, stackoverflow.com) because telling a new Python developer they can’t share a docs link would be absurd.

The whitelist also includes about 150 Indonesian Telegram community groups so new users can share links to other local tech communities. First violation gets a warning. Third violation gets a mute. The message gets deleted either way.
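A domain check along these lines can be sketched with the standard library. This is an illustration rather than the bot’s real matcher: the function name is hypothetical, only the three domains named above are listed, and allowing subdomains is an assumption.

```python
from urllib.parse import urlparse

# Only the domains named in the post; the real list is longer.
WHITELIST = frozenset({"docs.python.org", "github.com", "stackoverflow.com"})


def is_whitelisted(url: str) -> bool:
    """Return True if the URL's host is a whitelisted domain or a subdomain of one."""
    # urlparse only finds a hostname when the URL has a scheme, so
    # scheme-less text such as "github.com/x" is treated as not whitelisted.
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in WHITELIST)
```

Matching on the parsed hostname rather than on a substring keeps lookalikes such as github.com.evil.example from slipping through.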

DM unrestriction

This is the part I’m most pleased with. When someone gets muted, they get a link to DM the bot. The bot checks their profile, and if they’ve fixed whatever was wrong (added a photo, set a username), it automatically lifts the restriction and posts a note to the admin topic.

Before this existed, restricted users had to message an admin and wait for someone to manually check and unrestrict them. Sometimes that took hours.

The multi-group problem

The bot started as a single-group thing. One .env file, one group_id, done. Then other Indonesian developer communities asked if they could use it too.

Supporting multiple groups meant rethinking most of the architecture. Hendy Santika contributed the initial multi-group refactor, introducing a GroupConfig Pydantic model and a GroupRegistry that stores per-group settings. Each group gets its own warning topic, captcha toggle, and probation duration.

Configuration moved from .env to a groups.json file:

[
  {
    "group_id": -1001234567890,
    "warning_topic_id": 123,
    "captcha_enabled": true,
    "warning_time_threshold_minutes": 180
  }
]

The .env fallback still works for single-group setups. I didn’t want to break the simple case.
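To show the shape of the loading step, here is a sketch that uses a plain dataclass instead of the Pydantic model, so it runs with the standard library alone. The default values, the load_groups name, and the 1 to 10080 minute bound are assumptions; the field names come from the JSON above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupConfig:
    """Dataclass stand-in for the Pydantic model; defaults are assumptions."""
    group_id: int
    warning_topic_id: int
    captcha_enabled: bool = False
    warning_time_threshold_minutes: int = 180

    def __post_init__(self) -> None:
        # Telegram group and supergroup chat IDs are negative.
        if self.group_id >= 0:
            raise ValueError("group_id must be negative")
        # Hypothetical sanity bound: between one minute and one week.
        if not 1 <= self.warning_time_threshold_minutes <= 10080:
            raise ValueError("warning_time_threshold_minutes out of range")


def load_groups(text: str) -> dict[int, GroupConfig]:
    """Parse the contents of groups.json into a registry keyed by group_id."""
    configs = (GroupConfig(**item) for item in json.loads(text))
    return {cfg.group_id: cfg for cfg in configs}
```

Failing loudly at load time means a typo in groups.json stops the bot at startup instead of surfacing later inside a handler.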

Everything that broke

The multi-group refactor surfaced problems I hadn’t thought about.

The captcha callback data was captcha_verify:{user_id}. If a user joined two groups at the same time, the bot couldn’t tell which group the captcha was for. Fixed it to captcha_verify_{group_id}_{user_id}.
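Building and parsing the new format might look like the sketch below. The helper names are hypothetical; the one thing to get right is that group IDs are negative, so the parser has to accept a leading minus sign.

```python
import re
from typing import Optional

# group_id may carry a leading minus sign; user_id is a plain positive integer.
_CALLBACK_RE = re.compile(r"^captcha_verify_(-?\d+)_(\d+)$")


def build_callback_data(group_id: int, user_id: int) -> str:
    """Encode both IDs so a button press identifies the group as well as the user."""
    return f"captcha_verify_{group_id}_{user_id}"


def parse_callback_data(data: str) -> Optional[tuple[int, int]]:
    """Return (group_id, user_id), or None if the data is not in the new format."""
    match = _CALLBACK_RE.match(data)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Telegram caps callback data at 64 bytes, and two IDs plus the prefix fit comfortably within that.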

The scheduler that auto-restricts users ran in a loop over all groups. If the bot got kicked from one group, the entire loop crashed and no other group got processed. Wrapping each group’s API call in a try/except fixed that.
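The pattern is simple enough to sketch in a few lines. The names here are hypothetical, and a real bot would probably catch the library’s specific error types rather than a bare Exception.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger(__name__)


def process_all_groups(
    group_ids: Iterable[int],
    process_group: Callable[[int], None],
) -> list[int]:
    """Run process_group for every group and return the IDs that failed.

    An error in one group is logged and skipped so the loop carries on
    with the remaining groups.
    """
    failed: list[int] = []
    for group_id in group_ids:
        try:
            process_group(group_id)
        except Exception:
            logger.exception("Skipping group %s after an error", group_id)
            failed.append(group_id)
    return failed
```

Returning the failed IDs gives the caller something to report to admins instead of silently dropping a group.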

The DM unrestriction flow had a similar issue. A restricted user messages the bot privately, the bot checks their profile and lifts the restriction. But with multiple groups, it needed to check and unrestrict across all monitored groups, and handle cases where it no longer had access to some of them.

I caught all of these during code review on the pull requests, before they hit production. They all looked obvious once I spotted them.

The Markdown incident

Telegram’s Markdown v1 parser is fragile. I spent an embarrassing amount of time debugging why some warning messages showed up as raw text instead of formatted messages.

Two causes:

Parentheses inside link text break the parser. [Contact Bot (for help)](url) does not work. Moving the parenthesized text outside the brackets fixed it.
Usernames with underscores (@Sharo_Kenne) get interpreted as italic markers. If the underscore doesn’t have a matching close, the entire message falls back to raw text.

The fix was calling escape_markdown(username, version=1) for every username mention. A one-liner that took three hours to figure out.
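In python-telegram-bot v20 that helper lives in telegram.helpers. To show what the escaping does without pulling in the library, the sketch below uses a stdlib stand-in that escapes the four characters the Bot API documents as special in legacy Markdown. The function names and message wording are made up for illustration.

```python
import re


def escape_markdown_v1(text: str) -> str:
    """Stand-in for escape_markdown(text, version=1): escape _ * ` and [."""
    return re.sub(r"([_*`\[])", r"\\\1", text)


def build_warning(username: str, bot_url: str) -> str:
    """Compose a warning with an escaped mention and a parenthesis-free link text."""
    mention = escape_markdown_v1("@" + username)
    # The parenthesized note sits outside the brackets, per the first fix above.
    return f"{mention}, please complete your profile. [Contact Bot]({bot_url}) (for help)"
```

With this in place, @Sharo_Kenne goes out as @Sharo\_Kenne, so the underscore no longer opens an italic span that never closes.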

The stack

Python 3.11+ with python-telegram-bot v20
SQLite with WAL mode for the database (SQLModel/SQLAlchemy)
Pydantic for configuration validation
Logfire for structured logging
uv as the package manager
Docker and Docker Compose for deployment
442 tests, 99% coverage

The startup validates configuration (group IDs must be negative, timeouts must be within sane ranges) and the database sets WAL mode and synchronous=NORMAL on init. These things sound boring but they prevent the kind of bugs that only show up at 2 AM on a Saturday.
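The database half of that is two pragmas. The sketch below uses the standard sqlite3 module directly rather than SQLModel, so it is an illustration of the settings and not the bot’s init code; open_database and the file path are hypothetical.

```python
import os
import sqlite3
import tempfile


def open_database(path: str) -> sqlite3.Connection:
    """Open a SQLite database with WAL journaling and synchronous=NORMAL."""
    conn = sqlite3.connect(path)
    # WAL is stored in the database file and persists across connections.
    conn.execute("PRAGMA journal_mode=WAL")
    # synchronous is a per-connection setting, so it is applied on every open.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn


# Usage: WAL needs a file-backed database, so the demo uses a temp directory.
db_path = os.path.join(tempfile.mkdtemp(), "bot.db")
conn = open_database(db_path)
```

Querying PRAGMA journal_mode on the returned connection is a cheap way to confirm the setting took effect.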

What I’d do differently

I’d start with multi-group support from day one. Retrofitting it was the hardest part of the project. The single-group assumption was baked into every handler, every database query, every scheduler job. Pulling it out was like removing a load-bearing wall.

I’d also write the DM unrestriction flow earlier. It reduced admin workload more than any other feature. People fix their profiles quickly when they know there’s an automated way back in.

I run it for PythonID on my VPS alongside Miniflux, Mastodon, and a handful of other self-hosted services. JVM Indonesia, IDDevOps, and KotlinID run their own instances. Other communities picked it up and deployed it themselves without me having to do anything, which is exactly how open source should work.

The source code is on GitHub if you want to look at it or run your own instance.
