Handling WAL File Bloat on Constrained Storage

On a desktop with a spare terabyte, a -wal file that grows to 200 MB is a curiosity. On an industrial gateway with a 4 GB eMMC partition, or a field sensor writing to an 8 GB SD card, the same growth is an outage: the partition fills, the kernel returns ENOSPC, SQLite surfaces it as SQLITE_FULL ("database or disk is full"), and the ingestion process dies mid-transaction. The trap is that the WAL is supposed to stay bounded on its own: a checkpoint folds its frames back into the main database, and the file is then reused from the start or truncated. When it keeps growing on constrained media, the cause is almost never disk speed; it is a checkpoint that runs but reclaims nothing because something is pinning the log. This page covers that exact scenario as one step within Checkpoint Frequency Tuning, part of the broader WAL Optimization & Concurrency Tuning discipline. It assumes the database is already in Write-Ahead Logging mode; if it is not, there is no -wal file to bloat and this is the wrong page.

Diagnosis

Confirm you have this problem — a checkpoint that runs but frees nothing — rather than simply an autocheckpoint threshold set too high. Two signals together are conclusive: the on-disk -wal size, and what a manual checkpoint reports back.

First, measure the log directly. The -wal file lives beside the database and its size in bytes is the ground truth; do not trust the page count alone:

import os

db_path = "/var/lib/telemetry/sensors.db"
wal_bytes = os.path.getsize(db_path + "-wal")   # the -wal sidecar, not the main db
print(f"-wal = {wal_bytes/1024:.0f} KiB")       # tens of MiB on constrained media == trouble

Second, run a checkpoint by hand and read its three-column result. PRAGMA wal_checkpoint returns (busy, log_pages, checkpointed_pages): busy is 1 when a FULL, RESTART, or TRUNCATE checkpoint was blocked from completing (a PASSIVE checkpoint normally reports 0 even when a reader held it back), log_pages is how many frames are in the WAL, and checkpointed_pages is how many of those have been written back to the database:

import sqlite3

conn = sqlite3.connect(db_path)
busy, log_pages, ckpt_pages = conn.execute("PRAGMA wal_checkpoint(PASSIVE);").fetchone()
print(f"busy={busy} log={log_pages} checkpointed={ckpt_pages}")

The signature of this problem is checkpointed_pages far smaller than log_pages: the checkpoint ran, could not get past a reader's snapshot, and left most of the log in place. In PASSIVE mode busy usually stays 0 even then; repeat the call with FULL, RESTART, or TRUNCATE and the same blockage shows up as busy=1. A PASSIVE checkpoint never truncates the file even when it succeeds, so a -wal that sits at its high-water mark alongside a healthy autocheckpoint is expected until you switch to a truncating mode; a -wal that keeps growing is not. The blocker is almost always a long-lived read transaction: SQLite cannot recycle frames past the oldest snapshot any reader still holds open, so one forgotten cursor on a dashboard query pins the entire log. Confirm by checking whether any connection has an open transaction; if checkpointed_pages catches up to log_pages the instant you close your readers, you have found it.
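
The pinning effect is easy to reproduce in isolation. The following is a minimal sketch rather than a production routine: it assumes a scratch database in a temporary directory, a hypothetical table t, and timeout=0 on the checkpointing connection so that a blocked checkpoint reports immediately instead of waiting out Python's default busy timeout.

```python
import os
import sqlite3
import tempfile

def demo_pinned_wal():
    """Show a read transaction blocking a TRUNCATE checkpoint, then releasing it."""
    path = os.path.join(tempfile.mkdtemp(), "scratch.db")  # throwaway database

    # Autocommit writer; timeout=0 makes a blocked checkpoint report at once.
    writer = sqlite3.connect(path, isolation_level=None, timeout=0)
    writer.execute("PRAGMA journal_mode=WAL;")
    writer.execute("CREATE TABLE t(x INTEGER);")  # hypothetical table
    writer.execute("INSERT INTO t VALUES (1);")

    # The reader opens a transaction, takes a snapshot, and simply stays open.
    reader = sqlite3.connect(path, isolation_level=None)
    reader.execute("BEGIN;")
    reader.execute("SELECT count(*) FROM t;").fetchone()

    # A later commit lands in the WAL beyond the reader's snapshot.
    writer.execute("BEGIN;")
    writer.executemany("INSERT INTO t VALUES (?);", [(i,) for i in range(200)])
    writer.execute("COMMIT;")

    blocked = writer.execute("PRAGMA wal_checkpoint(TRUNCATE);").fetchone()
    size_blocked = os.path.getsize(path + "-wal")

    reader.execute("COMMIT;")  # end the read transaction, releasing the snapshot
    freed = writer.execute("PRAGMA wal_checkpoint(TRUNCATE);").fetchone()
    size_freed = os.path.getsize(path + "-wal")

    reader.close()
    writer.close()
    return blocked, size_blocked, freed, size_freed
```

The first tuple should report busy=1 with fewer checkpointed pages than log pages while the reader is open; the second should report busy=0 with the file back at zero bytes.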

Solution

Replace passive, best-effort reclamation with a bounded TRUNCATE checkpoint that returns the file to zero bytes, and refuse to let the -wal grow past a hard ceiling between checkpoints. The focused routine below does exactly that and nothing tangential — it sets the two PRAGMAs that cap growth, forces a truncating checkpoint, and verifies the outcome on disk:

import os
import sqlite3
import logging

logger = logging.getLogger("wal_reclaimer")

def reclaim_wal(db_path: str, ceiling_bytes: int = 1_048_576) -> int:
    """Force the -wal file back to zero on constrained storage, with verification."""
    # isolation_level=None -> autocommit, so each PRAGMA runs immediately and is
    # never wrapped in an implicit BEGIN that would defer or discard it.
    conn = sqlite3.connect(db_path, isolation_level=None)

    # journal_size_limit is a post-checkpoint TRUNCATE target in BYTES: after a
    # checkpoint, any -wal larger than this is trimmed back to it. It does NOT
    # block growth mid-transaction, so it is a safety net, not the primary cap.
    conn.execute(f"PRAGMA journal_size_limit = {ceiling_bytes};")

    # wal_autocheckpoint in PAGES (default page = 4096 B). 256 pages ~= 1 MiB.
    # SQLite has no background thread: the automatic checkpoint runs inside the
    # commit that crosses this threshold, and the setting covers this handle only.
    conn.execute("PRAGMA wal_autocheckpoint = 256;")

    # TRUNCATE mode does what PASSIVE will not: on success it shrinks the -wal
    # file to zero bytes, reclaiming the space rather than just marking it reusable.
    busy, log_pages, ckpt_pages = conn.execute(
        "PRAGMA wal_checkpoint(TRUNCATE);"
    ).fetchone()
    if busy:
        # A reader still pinned the log; TRUNCATE could not run to completion.
        logger.warning("checkpoint blocked: log=%d checkpointed=%d", log_pages, ckpt_pages)

    # Verify on disk, not from the return tuple: read the actual file size back.
    wal_bytes = os.path.getsize(db_path + "-wal") if os.path.exists(db_path + "-wal") else 0
    logger.info("post-checkpoint -wal = %d bytes", wal_bytes)
    conn.close()
    return wal_bytes

Two details make this hold on flash. journal_size_limit is expressed in bytes and only trims after a checkpoint, so it is the backstop that keeps a briefly oversized log from persisting; it is not what prevents growth. The TRUNCATE argument is the active ingredient: unlike PASSIVE, it collapses the file to zero on success, which hands the space back to the filesystem instead of leaving a high-water-mark file in place. Both PRAGMAs are per-connection, so the values set inside reclaim_wal last only as long as its handle; apply the same two settings on the long-lived writer connection as well. Run reclaim_wal on a timer during ingestion lulls, or trigger it from a watchdog when os.path.getsize crosses ~80% of your ceiling; do not wait for the partition to fill. For the calibration of how often to fire it against write velocity, see Optimizing wal_autocheckpoint for Continuous Logging.

Verification

Three checks, cheapest first.

First, prove the reclaim actually shrank the file. reclaim_wal returns the post-checkpoint size in bytes; assert it collapsed:

size = reclaim_wal(db_path)
assert size < 4096, f"-wal did not truncate: {size} bytes still present"

A value at or near zero means TRUNCATE ran to completion. A value still in the megabytes means a reader blocked it — jump to the gotchas below.

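Second, read back the two caps on the connection that actually writes. Both settings are per-connection, so a handle that never had them applied, including one that a pool or ORM recycled, reports SQLite's defaults instead:

# conn must be the writer handle on which the two PRAGMAs were set
limit = conn.execute("PRAGMA journal_size_limit;").fetchone()[0]
auto  = conn.execute("PRAGMA wal_autocheckpoint;").fetchone()[0]
assert limit == 1_048_576 and auto == 256, f"caps not applied: limit={limit} auto={auto}"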

Third, confirm the file stays bounded under sustained writes rather than just once. Sample it in a loop while a writer runs and assert the ceiling holds:

import time

for _ in range(20):
    conn.execute("INSERT INTO readings(sensor, value) VALUES (?, ?);", ("bme280", 21.4))
    conn.commit()  # autocheckpoint only runs at commit; harmless if conn is in autocommit mode
    wal = os.path.getsize(db_path + "-wal")
    assert wal < 2_097_152, f"-wal breached 2 MiB during writes: {wal}"  # 2x ceiling headroom
    time.sleep(0.05)

If the file plateaus near your journal_size_limit and drops to zero each time reclaim_wal runs, the reclamation is working. Autocheckpoints are PASSIVE, so they keep the file reusable but do not shrink it to zero themselves. A monotonic climb means growth is outpacing the checkpoint cadence; tighten wal_autocheckpoint or raise the trigger frequency, as covered in Threshold Tuning for High-Write Workloads.

Failure Modes & Gotchas

A long-lived reader pins the log and TRUNCATE reclaims nothing. This is the dominant cause and the one that survives every PRAGMA change. SQLite cannot recycle frames older than the oldest snapshot any open read transaction still references, so a dashboard query, an analytics cursor, or an ORM session left open across requests holds the entire -wal hostage — the checkpoint returns busy=1 and the file never shrinks. No journal_size_limit and no autocheckpoint value can override an active reader; the frames are simply not free to reclaim. Enforce bounded read-transaction lifetimes at the connection pooling layer and close read cursors promptly. The same discipline that fixes checkpoint starvation here is detailed under Reducing Lock Contention in Multi-Threaded Apps.
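
One way to enforce short read transactions in application code is a context manager that always ends the transaction on exit. The sketch below is an assumption about how your code is organized, not a feature of SQLite or of any pool: bounded_read is a hypothetical helper, and it expects a connection opened with isolation_level=None so that BEGIN and ROLLBACK are issued explicitly.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def bounded_read(conn: sqlite3.Connection):
    """Hold a read snapshot only for the body of the with-block."""
    conn.execute("BEGIN;")  # deferred: the snapshot is taken at the first read
    try:
        yield conn
    finally:
        # Always end the transaction so the snapshot stops pinning WAL frames,
        # even when the body raised.
        if conn.in_transaction:
            conn.execute("ROLLBACK;")
```

Fetch every row inside the with-block (fetchall rather than a lazily iterated cursor) so nothing keeps reading after the transaction has ended.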

journal_size_limit is a post-checkpoint trim, not a hard cap. It is tempting to read PRAGMA journal_size_limit = 1048576 as “the WAL can never exceed 1 MiB.” It cannot do that. A single large transaction, or a burst arriving between checkpoints, will grow the -wal well past the limit while it is being written — the limit only trims the file back after a checkpoint completes. On a partition with little headroom, the transient spike is what triggers ENOSPC, not the steady-state size. Size the partition for the worst-case burst, and pair the limit with proactive size-triggered checkpoints rather than trusting the number as a ceiling.
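
To put a number on the worst-case burst, the on-disk WAL format gives a simple estimate: a 32-byte file header plus one frame per page written, where each frame is a 24-byte header followed by the page image. The helper below is a back-of-the-envelope sketch; wal_bytes_for is a hypothetical name, and the number of frames a burst produces is something you have to measure or bound for your own workload.

```python
WAL_HEADER_BYTES = 32    # fixed header at the start of the -wal file
FRAME_HEADER_BYTES = 24  # per-frame header preceding each page image

def wal_bytes_for(frames: int, page_size: int = 4096) -> int:
    """Estimated -wal size after `frames` page writes with no checkpoint in between."""
    return WAL_HEADER_BYTES + frames * (FRAME_HEADER_BYTES + page_size)
```

With the default 4096-byte page, 256 frames already come to 1,054,752 bytes, slightly over the 1 MiB limit used on this page, and a single 10,000-frame transaction needs about 41 MB of free space for the log alone.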

Pooled or ORM handles quietly drop your PRAGMAs, and mmap can pin pages. journal_size_limit and wal_autocheckpoint are per-connection and reset on every new handle, so if a pool vends a connection that was never configured, the caps are absent and the read-back assertion above never runs against it. Route every connection through one initializer and assert inside it. Separately, an aggressive memory-mapped I/O configuration can keep WAL pages mapped in the OS page cache, and on an SD card the controller’s own write reordering means a TRUNCATE that returns success may not have durably shrunk the file until the next fsync lands. Verify with os.path.getsize after the operation settles, and keep active databases on local flash where advisory locking and fsync semantics are honored, never on a network mount.

Related Pages

Checkpoint Frequency Tuning: the parent guide on how checkpoint cadence trades WAL size against I/O overhead.
Optimizing wal_autocheckpoint for Continuous Logging: calibrate the page threshold that triggers automatic checkpoints.
Threshold Tuning for High-Write Workloads: when growth outpaces reclamation and the cadence itself needs work.
Configuring the synchronous PRAGMA for Crash Safety: the durability setting that bounds your data-loss window around each checkpoint.
