
Over-Engineering a Clap Button, Again

4 min read

In the previous post, I built a clap button for this blog using Cloudflare Workers and KV. It worked. Mostly.

The problem was concurrency. KV is an eventually consistent key-value store. When two people clap at the same time, both reads return the same value, both increment by one, and one clap disappears. Read-modify-write on KV is not atomic.
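For the curious, the KV version boiled down to something like this (a sketch with an illustrative CLAPS_KV binding, not the exact old code):

// Read-modify-write against KV: two overlapping requests can both
// read 41 and both write 42, losing a clap.
async function incrementClapKV(env, slug, incrementBy) {
  const key = `claps:${slug}`;
  const current = parseInt(await env.CLAPS_KV.get(key), 10) || 0; // read
  const next = current + incrementBy;                             // modify
  await env.CLAPS_KV.put(key, String(next));                      // write
  return next;
}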

For a blog that gets three readers on a good day, this probably didn’t matter. But it bugged me. I also wanted to stop storing raw IP addresses for rate limiting. So I migrated the backend to Cloudflare D1, their SQLite-at-the-edge database.

The code is on GitHub. The frontend didn’t change at all.

The schema

D1 is just SQLite. I created two tables: one for clap counts, one for rate limiting.

-- migrations/0001_init.sql
CREATE TABLE IF NOT EXISTS claps (
  slug TEXT PRIMARY KEY,
  count INTEGER NOT NULL DEFAULT 0,
  updated_at INTEGER NOT NULL
);

CREATE TABLE IF NOT EXISTS rate_limits (
  ip_hash TEXT NOT NULL,
  slug TEXT NOT NULL,
  window_start INTEGER NOT NULL,
  count INTEGER NOT NULL DEFAULT 0,
  updated_at INTEGER NOT NULL,
  PRIMARY KEY (ip_hash, slug, window_start)
);

CREATE INDEX IF NOT EXISTS idx_rate_limits_window_start
  ON rate_limits(window_start);

Two things to note. First, ip_hash stores a SHA-256 hash of the IP address, salted with an environment variable. No raw IPs touch the database. Second, the composite primary key (ip_hash, slug, window_start) means rate limits are scoped per user, per post, per hour window.
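The window itself is nothing fancy: window_start is just the current timestamp truncated to the top of the hour, along the lines of this sketch:

// Milliseconds since epoch, truncated to the start of the current hour.
// Every clap from the same IP on the same post in that hour hits the
// same (ip_hash, slug, window_start) row.
const WINDOW_MS = 60 * 60 * 1000;
const now = Date.now();
const windowStart = Math.floor(now / WINDOW_MS) * WINDOW_MS;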

Atomic writes

This was the whole point of the migration. With KV, incrementing a counter required three steps: read the current value, add to it, write it back. If two requests overlap, one write clobbers the other.

With D1, I use ON CONFLICT ... DO UPDATE to make the whole thing a single statement. No read step, no race.

// Rate limit + increment in a batch
const rateLimitStmt = c.env.DB.prepare(`
  INSERT INTO rate_limits (ip_hash, slug, window_start, count, updated_at)
  VALUES (?1, ?2, ?3, ?4, ?5)
  ON CONFLICT(ip_hash, slug, window_start) DO UPDATE SET
    count = rate_limits.count + excluded.count,
    updated_at = excluded.updated_at
  WHERE rate_limits.count + excluded.count <= ?6
`).bind(ipHash, slug, windowStart, incrementBy, now, MAX_CLAPS_PER_IP);

const clapsStmt = c.env.DB.prepare(`
  INSERT INTO claps (slug, count, updated_at)
  VALUES (?1, ?2, ?3)
  ON CONFLICT(slug) DO UPDATE SET
    count = claps.count + excluded.count,
    updated_at = excluded.updated_at
  RETURNING count
`).bind(slug, incrementBy, now);

const [rateLimitResult, clapsResult] = await c.env.DB.batch([
  rateLimitStmt,
  clapsStmt,
]);

The WHERE clause on the rate limit insert is doing double duty. If the user has already exceeded the limit, the conflicting update silently changes nothing (zero rows changed). I check rateLimitResult.meta.changes after the batch and return a 429 if it's zero, and both statements go out in a single round trip.
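In the handler, the check after the batch looks roughly like this (a sketch; the response shape is illustrative):

// Zero changed rows means the WHERE clause rejected the rate-limit upsert.
if (rateLimitResult.meta.changes === 0) {
  return c.json({ error: 'Too many claps' }, 429);
}

// RETURNING count hands back the new total without a second query.
const total = clapsResult.results[0].count;
return c.json({ count: total });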

Hashing IPs

KV stored raw IPs as part of the rate limit key. That felt wrong. D1 gave me a reason to fix it.

async function sha256Hex(env, input) {
  const salt = env.IP_HASH_SALT || '';
  const data = new TextEncoder().encode(salt + ':' + input);
  const hash = await crypto.subtle.digest('SHA-256', data);
  return [...new Uint8Array(hash)]
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}

The salt is an environment variable in wrangler.toml. Even if someone gets access to the database, they can’t reverse the IPs without the salt. It’s not bulletproof, but it’s a lot better than storing 192.168.1.42 in plaintext.
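The input comes from Cloudflare's CF-Connecting-IP header, so the handler does something along these lines (a sketch):

// Cloudflare sets CF-Connecting-IP to the client's real address.
const ip = c.req.header('CF-Connecting-IP') || 'unknown';
const ipHash = await sha256Hex(c.env, ip);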

Cleaning up expired rate limits

KV had a nice feature for this: expirationTtl. Set a TTL when you write a key, and KV deletes it automatically. D1 doesn’t have TTL. Rows stick around until you delete them.

I added a scheduled handler that runs once a day at midnight UTC. It deletes rate limit rows older than two hours in batches to avoid timeouts and keep D1 billing reasonable.

export default {
  fetch: app.fetch,
  async scheduled(event, env) {
    const cutoff = Date.now() - (2 * 60 * 60 * 1000);
    const BATCH_SIZE = 1000;
    while (true) {
      const result = await env.DB.prepare(`
        DELETE FROM rate_limits
        WHERE rowid IN (
          SELECT rowid FROM rate_limits
          WHERE window_start < ?1
          ORDER BY window_start
          LIMIT ?2
        )
      `).bind(cutoff, BATCH_SIZE).run();
      if (result.meta.changes < BATCH_SIZE) break;
    }
  }
};

The ORDER BY window_start LIMIT ?2 pattern is intentional. D1 charges per row read, so a batched subquery that uses the index on window_start is cheaper than a blanket DELETE WHERE window_start < cutoff that might scan the whole table.

Migrating the data

I had about 10 clap records in KV. Not exactly a big data migration, but I still wanted to do it properly.

I created a temporary admin endpoint that iterated through all claps:* keys in KV and upserted them into D1. Ran it once, verified the counts matched with wrangler d1 execute, then deleted the endpoint and removed the KV binding.
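The endpoint itself was throwaway code. It looked something like this sketch (illustrative route and CLAPS_KV binding, with the auth check omitted):

// One-off migration: copy every claps:* key from KV into D1.
app.post('/admin/migrate', async (c) => {
  let cursor;
  do {
    const page = await c.env.CLAPS_KV.list({ prefix: 'claps:', cursor });
    for (const key of page.keys) {
      const slug = key.name.slice('claps:'.length);
      const count = parseInt(await c.env.CLAPS_KV.get(key.name), 10) || 0;
      await c.env.DB.prepare(`
        INSERT INTO claps (slug, count, updated_at)
        VALUES (?1, ?2, ?3)
        ON CONFLICT(slug) DO UPDATE SET
          count = excluded.count,
          updated_at = excluded.updated_at
      `).bind(slug, count, Date.now()).run();
    }
    cursor = page.list_complete ? undefined : page.cursor;
  } while (cursor);
  return c.text('done');
});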

The wrangler config

The wrangler.toml changes are minimal. Swap the KV binding for a D1 binding and add a cron trigger.

name = "clap-backend"
main = "worker.js"
[vars]
ALLOWED_ORIGINS = "https://rezhajul.io,http://localhost:2222"
MAX_CLAPS_PER_REQUEST = "10"
MAX_CLAPS_PER_IP = "50"
[[d1_databases]]
binding = "DB"
database_name = "clap-backend"
database_id = "YOUR_D1_DATABASE_ID_HERE"
migrations_dir = "migrations"
[triggers]
crons = ["0 0 * * *"]

Was it worth it?

For a blog with my traffic, honestly, probably not. KV would have been fine for years. But the migration took about two hours, and now I don’t have to think about lost writes or storing raw IPs. D1’s free tier is generous enough that I’m not paying anything extra.

If you’re building something similar and expect any real concurrency, skip KV and start with D1. The atomic writes alone are worth it.

