Reducing Lock Contention in Multi-Threaded Apps

A single SQLite database can only ever have one writer at a time. That is a design invariant, not a tunable. So when an industrial gateway, a desktop automation daemon, or a Python worker pool runs several threads that all write, the threads do not write in parallel; they queue for one write lock, and every thread that loses the race gets SQLITE_BUSY (error 5) or, when the conflict sits inside a shared connection or shared cache, SQLITE_LOCKED (error 6). Teams usually respond by scattering retry loops and raising timeouts, which converts hard errors into latency spikes and watchdog freezes without removing the contention. The durable fix is architectural: make the application serialize writes deterministically instead of letting the database serialize them by rejection. This page addresses that exact scenario as one step within Connection Pooling Strategies, part of the broader WAL Optimization & Concurrency Tuning discipline. It assumes the database is already in Write-Ahead Logging mode so that readers never block the writer; if you are still on the rollback journal, fix that first.

Diagnosis

Confirm you have this problem — write-lock contention between threads — and not, say, a single long reader stalling a checkpoint. Three signals together are conclusive.

First, the error itself. Under multi-threaded write load the failure surfaces as sqlite3.OperationalError: database is locked (SQLITE_BUSY) or database table is locked (SQLITE_LOCKED), and it appears only when concurrency rises. If a single-threaded run of the same workload never trips it, the database is not slow — your threads are colliding on the write lock.
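To see the error in isolation, the sketch below provokes it deliberately with two connections on one thread. The function name provoke_busy and the readings table are illustrative assumptions, and timeout=0 is set only so the losing connection fails at once instead of waiting.

```python
import os
import sqlite3
import tempfile

def provoke_busy(db_path):
    """Return the error message the losing writer sees, or None if it got the lock."""
    # timeout=0 disables the busy handler so the loser fails immediately (assumption for the demo)
    holder = sqlite3.connect(db_path, isolation_level=None, timeout=0)
    loser = sqlite3.connect(db_path, isolation_level=None, timeout=0)
    try:
        holder.execute("PRAGMA journal_mode=WAL;")
        holder.execute("CREATE TABLE IF NOT EXISTS readings(v INTEGER);")
        holder.execute("BEGIN IMMEDIATE;")               # holder now owns the write lock
        holder.execute("INSERT INTO readings(v) VALUES (1);")
        try:
            loser.execute("BEGIN IMMEDIATE;")            # second writer asks for the same lock
        except sqlite3.OperationalError as exc:
            return str(exc)                              # the message the application would log
        finally:
            holder.execute("ROLLBACK;")                  # release the lock either way
        loser.execute("ROLLBACK;")                       # only reached if the loser unexpectedly won
        return None
    finally:
        holder.close()
        loser.close()

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(provoke_busy(os.path.join(d, "probe.db")))
```

If the message printed here matches what your logs show under load, you are looking at write-lock rejection rather than a slow query.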

Second, prove the writers are actually serialized against each other. Instrument the time each thread spends inside BEGIN IMMEDIATE; the reservation is where the lock is acquired, so blocked writers pile up there:

import time, sqlite3, logging

log = logging.getLogger("lock_probe")

def timed_write(conn: sqlite3.Connection, sql: str, params=()):
    t0 = time.monotonic()
    conn.execute("BEGIN IMMEDIATE;")          # takes the write lock NOW, not at first write
    waited = time.monotonic() - t0            # time spent blocked waiting for the lock
    try:
        conn.execute(sql, params)
        conn.execute("COMMIT;")
    except Exception:
        conn.execute("ROLLBACK;")             # never leave the lock held after a failed statement
        raise
    if waited > 0.05:                         # >50ms blocked == real contention, not noise
        log.warning("writer blocked %.0fms acquiring write lock", waited * 1000)

If those warnings fire under load and vanish when you drop to one writer thread, the diagnosis is settled. A rising waited value that scales with thread count is the fingerprint of lock contention specifically.

Third, rule out the impostor. If PRAGMA wal_checkpoint(PASSIVE); keeps reporting fewer checkpointed frames than log frames (third column below the second), or a FULL, RESTART, or TRUNCATE checkpoint returns busy=1, a long-lived reader is pinning the log. That is a different failure, covered in Handling WAL File Bloat on Constrained Storage. Lock contention between writers, by contrast, shows up as blocked BEGIN IMMEDIATE calls, not blocked checkpoints. A frequent root cause hides here too: one sqlite3.Connection object shared across threads with check_same_thread=False. That does not add concurrency. It multiplexes every thread onto one serialized handle and can raise SQLITE_MISUSE or silently corrupt cursor state. Grep for check_same_thread=False before anything else.
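A small probe can make the checkpoint check repeatable. This is a sketch, not part of the page's pattern: checkpoint_status is a hypothetical helper, and it relies on PRAGMA wal_checkpoint returning one row of three integers (busy flag, frames in the log, frames checkpointed). A PASSIVE checkpoint that cannot drain the whole log is a hint that some reader is holding an old snapshot.

```python
import sqlite3

def checkpoint_status(conn: sqlite3.Connection) -> dict:
    """Run a PASSIVE checkpoint and report how much of the WAL it could drain."""
    busy, log_frames, checkpointed = conn.execute(
        "PRAGMA wal_checkpoint(PASSIVE);"
    ).fetchone()
    return {
        "busy": busy,                          # 1 if the checkpoint could not run to completion
        "log_frames": log_frames,              # frames currently in the WAL (-1 if not in WAL mode)
        "checkpointed": checkpointed,          # frames already copied back into the database
        "pinned": checkpointed < log_frames,   # something is holding part of the log back
    }
```

A single pinned result is not conclusive, since a writer may simply have appended frames in the meantime. A result that stays pinned across repeated calls while writers are idle points at a long-lived reader rather than at writer contention.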

Solution

Stop letting the database arbitrate writes by rejection. Route every write through a single dedicated writer thread fed by a queue, and give each reader its own connection so readers proceed in parallel against WAL snapshots. The writer never contends with itself because there is only one of it, so SQLITE_BUSY between your own threads becomes structurally impossible.

Figure — Routing writes through a single serialized writer while readers use isolated per-thread connections removes the lock-upgrade races that produce SQLITE_BUSY/SQLITE_LOCKED.

The routine below is the whole pattern and nothing tangential: one writer thread draining a queue.Queue, and thread-local reader connections created on demand.

import sqlite3, threading, queue

# --- one initializer, applied identically to every connection this app opens ---
def _open(db_path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path, isolation_level=None)   # autocommit; we manage txns explicitly
    conn.execute("PRAGMA journal_mode=WAL;")                # readers never block the single writer
    conn.execute("PRAGMA synchronous=NORMAL;")              # fsync at checkpoint, not per-commit
    conn.execute("PRAGMA busy_timeout=5000;")               # internal backoff instead of instant SQLITE_BUSY
    conn.execute("PRAGMA foreign_keys=ON;")                 # per-connection; off by default, easy to forget
    return conn

class WriterQueue:
    """Serializes ALL writes onto one thread so writers never contend with each other."""
    def __init__(self, db_path: str):
        self._db_path = db_path
        self._q: "queue.Queue" = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        conn = _open(self._db_path)                         # the ONE writer handle, owned by this thread
        while True:
            job = self._q.get()                             # blocks until a write is submitted
            if job is None:                                 # sentinel -> shut down cleanly
                conn.close()
                return
            sql, params, done = job
            try:
                conn.execute("BEGIN IMMEDIATE;")            # reserve the write lock up front
                conn.execute(sql, params)
                conn.execute("COMMIT;")
                done.set_result(None)
            except Exception as exc:                        # roll back so the next job starts clean
                if conn.in_transaction:                     # BEGIN itself may have failed; then there is nothing to undo
                    conn.execute("ROLLBACK;")
                done.set_exception(exc)

    def write(self, sql: str, params=()):
        from concurrent.futures import Future
        done: "Future" = Future()
        self._q.put((sql, params, done))                    # any thread may submit; only _run executes
        return done.result()                                # blocks caller, re-raises writer-side errors

    def close(self):
        self._q.put(None)                                   # sentinel handled in _run; queued writes drain first
        self._thread.join()                                 # wait until the writer has closed its connection

# --- readers: one connection per thread, never shared, never queued ---
_local = threading.local()
def reader(db_path: str) -> sqlite3.Connection:
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = _local.conn = _open(db_path)                 # created lazily, once per worker thread
    return conn                                             # WAL snapshot reads run fully in parallel

Every thread calls writer.write(...) from wherever it likes; the calls are enqueued and executed one at a time on the writer thread, so the write lock is never contested by your own code. Reads take a different path entirely — reader(db_path) hands back a connection bound to the calling thread via threading.local, and because WAL gives each reader a consistent snapshot, hundreds of reads run concurrently while the single writer commits. The busy_timeout=5000 is a backstop for other processes touching the file, not a crutch for your own threads.

Verification

Three checks, cheapest first.

First, prove the writer is genuinely singular — assert every committed write ran on the same thread id:

def _audit(conn):
    # call this from inside WriterQueue._run, between BEGIN IMMEDIATE and COMMIT, so that
    # get_ident() is evaluated on the thread that executes the write, not on the submitter
    conn.execute("INSERT INTO audit(tid) VALUES (?);", (threading.get_ident(),))

# after a concurrent run, the audit table must contain exactly one distinct tid:
tids = {row[0] for row in reader(db_path).execute("SELECT DISTINCT tid FROM audit;")}
assert len(tids) == 1, f"writes ran on {len(tids)} threads; serialization is broken"

Second, read back the PRAGMAs on a reader connection, because a pool or ORM can hand you a recycled handle that never ran _open:

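c = reader(db_path)
assert c.execute("PRAGMA journal_mode;").fetchone()[0].lower() == "wal"
assert c.execute("PRAGMA busy_timeout;").fetchone()[0] == 5000
# journal_mode persists in the file and Python's connect() already defaults to a 5 s timeout,
# so also check settings that only _open could have applied to this handle:
assert c.execute("PRAGMA foreign_keys;").fetchone()[0] == 1
assert c.execute("PRAGMA synchronous;").fetchone()[0] == 1    # 1 == NORMAL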

Third, drive real contention and assert zero lock errors escape. Fan out many writer threads and require a clean run:

import concurrent.futures as cf

w = WriterQueue(db_path)
def hammer(n): w.write("INSERT INTO readings(v) VALUES (?);", (n,))

with cf.ThreadPoolExecutor(max_workers=32) as ex:
    errs = [f.exception() for f in [ex.submit(hammer, i) for i in range(5000)]]
assert not any(errs), f"{sum(e is not None for e in errs)} writes failed under load"

If 32 threads pushing 5,000 writes produce zero OperationalError, the contention is gone — the writes were serialized cleanly rather than rejected. A run that still raises database is locked means a write is escaping the queue; find the code path calling execute on a connection directly instead of going through write().

Failure Modes & Gotchas

A stray direct write bypasses the queue and reintroduces the race. The pattern only holds if every mutation goes through WriterQueue.write. One background job, migration script, or ORM session.commit() that opens its own connection and writes creates a second writer. Now the queue's writer and the stray writer contend for the lock exactly as before, except the failure is intermittent and hard to reproduce. Enforce it structurally: keep the only write-capable _open behind the queue, hand application code read-only connections (sqlite3.connect(f"file:{db_path}?mode=ro", uri=True); the file: prefix is required for the query string to be parsed), and route schema changes and VACUUM through the same single writer during a maintenance window. Never run DDL concurrently with the writer thread.
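A minimal sketch of the read-only handle, assuming a hypothetical helper named open_readonly. It builds a file: URI from the path and opens it with mode=ro, so any write attempted through the handle is rejected by SQLite itself instead of racing the writer thread.

```python
import sqlite3
from pathlib import Path

def open_readonly(db_path: str) -> sqlite3.Connection:
    """Open an existing database so that this handle can never take the write lock."""
    # as_uri() yields file:///abs/path and percent-encodes characters that are unsafe in a URI
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

A write through such a handle raises sqlite3.OperationalError, which turns a stray mutation into an immediate, reproducible failure at the call site. Note that mode=ro requires the database file to exist already.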

busy_timeout masks contention instead of removing it, and can stall a self-blocking design. Raising busy_timeout feels like a fix because the errors stop, but the threads are still serialized; they now sleep on the lock instead of failing on it, converting a visible error into invisible tail latency and watchdog timeouts. Worse, if you keep multiple writer connections and one holds a BEGIN IMMEDIATE while another waits, the timeout is the only thing bounding the stall, and it does not cover every case: a deferred transaction that reads first and then tries to write after another connection has committed fails at once with SQLITE_BUSY_SNAPSHOT, without the busy handler ever running. The single-writer queue removes the need for a long timeout entirely; keep busy_timeout modest (a few seconds) purely as a cross-process backstop, not as your concurrency strategy.
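One case the timeout does not help with is a deferred transaction that reads first and writes later. The sketch below reproduces it with two connections. The function name stale_snapshot_upgrade and the readings table are illustrative assumptions; the 5 second timeout is there to show that the failure arrives without any waiting.

```python
import sqlite3
import time

def stale_snapshot_upgrade(db_path: str):
    """Return (failed, seconds_waited) for a read-then-write on a stale WAL snapshot."""
    a = sqlite3.connect(db_path, isolation_level=None, timeout=5.0)
    b = sqlite3.connect(db_path, isolation_level=None, timeout=5.0)
    try:
        a.execute("PRAGMA journal_mode=WAL;")
        a.execute("CREATE TABLE IF NOT EXISTS readings(v INTEGER);")
        a.execute("BEGIN;")                                       # deferred: no write lock yet
        a.execute("SELECT count(*) FROM readings;").fetchall()    # snapshot is fixed here
        b.execute("INSERT INTO readings(v) VALUES (1);")          # another connection commits
        t0 = time.monotonic()
        try:
            a.execute("INSERT INTO readings(v) VALUES (2);")      # upgrade attempt on the old snapshot
            failed = False
        except sqlite3.OperationalError:
            failed = True                                         # reported as "database is locked"
        waited = time.monotonic() - t0
        a.execute("ROLLBACK;")                                    # the read transaction is still open
        return failed, waited
    finally:
        a.close()
        b.close()
```

Starting every writing transaction with BEGIN IMMEDIATE, as the writer thread on this page does, avoids this path because the write lock is taken before any snapshot is read.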

Shared handles and pool resets silently defeat isolation. Passing one sqlite3.Connection between threads with check_same_thread=False does not parallelize anything. It funnels every thread onto one serialized handle and risks SQLITE_MISUSE or cursor corruption; use the per-thread threading.local readers instead. Separately, only journal_mode=WAL persists in the database file. busy_timeout, synchronous, and foreign_keys are per connection and revert to their defaults on every fresh handle, so a connection pool that vends a handle which never ran _open will quietly run with foreign keys off, the build's default synchronous level (normally FULL), and whatever timeout the driver defaults to (zero in the C API, 5 seconds in Python's sqlite3.connect). The readback check in Verification catches this only if it inspects a per-connection setting such as foreign_keys and you actually run it. Route every connection, reader and writer alike, through the one initializer, and for the async case offload the blocking write() call to an executor as described in Async Execution Patterns rather than calling it on the event loop thread.
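For the async case, a rough sketch of the offload follows. It does not reuse WriterQueue; as a stand-in it assumes a single-worker ThreadPoolExecutor, which also keeps every write on one thread. The names _writer_pool, _blocking_write, and write_async are hypothetical.

```python
import asyncio
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

_writer_pool = ThreadPoolExecutor(max_workers=1)   # one worker == one writer thread
_state = threading.local()                         # holds the worker thread's connection

def _blocking_write(db_path, sql, params):
    conn = getattr(_state, "conn", None)
    if conn is None:                               # opened lazily on the worker thread itself
        conn = _state.conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("BEGIN IMMEDIATE;")
    try:
        conn.execute(sql, params)
        conn.execute("COMMIT;")
    except Exception:
        conn.execute("ROLLBACK;")
        raise
    return threading.get_ident()                   # returned only so callers can audit the thread

async def write_async(db_path, sql, params=()):
    loop = asyncio.get_running_loop()
    # the event loop thread only awaits; the blocking sqlite3 calls run on the worker
    return await loop.run_in_executor(_writer_pool, _blocking_write, db_path, sql, params)
```

Coroutines can then await write_async(...) freely, including under asyncio.gather; the submissions queue inside the executor and run one at a time, so the event loop never blocks on the write lock.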

Related Pages

Connection Pooling Strategies: the parent guide, covering per-thread connection lifecycle and why one handle per worker is the baseline.
Async Execution Patterns: offloading the blocking writer call off an asyncio event loop without stalling it.
Handling WAL File Bloat on Constrained Storage: the reader-side failure that looks similar but starves the checkpoint instead of the writer.
PRAGMA Optimization Guide: the full connection PRAGMA stack behind the _open initializer used here.
