Implementing Read-Only Replicas for Embedded Dashboards

On a constrained edge device, an embedded dashboard (Qt, Electron, or a lightweight local HTTP frontend) shares one SQLite file with the background telemetry writer, and the UI thread stalls with SQLITE_BUSY the moment a sensor burst holds the WAL write lock. Query latency that must stay under ~10 ms for a responsive gauge spikes into hundreds of milliseconds, and an unclean power cycle can leave a foreground read blocked behind WAL recovery. This page solves that one scenario: giving dashboards an immutable, read-only replica so UI reads never touch the live write path. It is the read-side counterpart to the write-side routing described in the Fallback Routing Strategies cluster, and applies the deterministic-edge principles of the wider SQLite Architecture & Production Hardening discipline. The replica is a snapshot artifact — regenerated on a fixed cadence or a checkpoint threshold — so dashboard consumers never open an active -wal file.

Figure — Read-path isolation: telemetry writers own the primary database, while dashboards read only from an immutable replica snapshot regenerated by the backup API.

Diagnosis

Confirm the dashboard is actually contending with the writer rather than running slow queries. Three signals together are conclusive:

Error code. Dashboard reads intermittently return SQLITE_BUSY (code 5) or SQLITE_LOCKED (code 6), correlated in time with ingest bursts — not with heavy SELECTs.
Shared lock file. The dashboard and the writer map the same -shm file. Check the open handles on the primary:
```
# Any dashboard PID listed here shares the writer's WAL/-shm lock namespace
lsof /data/primary.db-shm
```

Latency tracks WAL size. Read latency rises as the -wal grows between checkpoints. Sample it while the writer runs:

```
# If this climbs and query time climbs with it, reads are blocked by the writer
stat -c '%s' /data/primary.db-wal
```

If all three hold, the dashboard is on the write path and needs an isolated replica. If reads are slow but never return SQLITE_BUSY, the problem is query planning or cache sizing, not contention, and a replica will not help.
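
The third signal can also be sampled from code. The sketch below is a minimal illustration, not a definitive tool: the function name `sample_wal_and_latency` and the probe query are assumptions. It records the `-wal` size next to the latency of one probe query and flags any sample that failed with a busy/locked error.

```python
import os
import sqlite3
import time

def sample_wal_and_latency(db_path, query, samples=5, interval_s=0.0):
    """Return a list of (wal_bytes, latency_ms, busy) tuples for a probe query."""
    wal_path = db_path + "-wal"
    results = []
    conn = sqlite3.connect(db_path)
    try:
        for _ in range(samples):
            # Size of the write-ahead log at the moment of the probe (0 if absent).
            wal_bytes = os.path.getsize(wal_path) if os.path.exists(wal_path) else 0
            busy = False
            start = time.perf_counter()
            try:
                conn.execute(query).fetchall()
            except sqlite3.OperationalError:
                busy = True  # e.g. "database is locked" after the busy timeout
            latency_ms = (time.perf_counter() - start) * 1000.0
            results.append((wal_bytes, latency_ms, busy))
            time.sleep(interval_s)
    finally:
        conn.close()
    return results
```

Run it against the primary while the writer is active. If latency and busy flags rise with `wal_bytes`, that supports the contention diagnosis; flat latency with a growing WAL points elsewhere.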

Solution

The fix has three parts that must be applied together: keep the primary’s WAL bounded so snapshots are cheap, generate the replica with the Online Backup API (never a raw copy), and route dashboards through an immutable read-only connection.

First, keep the primary’s WAL from growing without bound so each snapshot is a quick, low-contention operation. Keep the busy_timeout value consistent with your existing busy-timeout configuration, and the checkpoint cadence consistent with your journaling-mode baseline:

```
PRAGMA journal_mode=WAL;          -- concurrent readers + single writer on the primary
PRAGMA synchronous=NORMAL;        -- sync only at checkpoints; no corruption on power loss, but recent commits may roll back
PRAGMA wal_autocheckpoint=1000;   -- passive checkpoint every 1000 pages (~4MB at 4KB page); limits -wal growth
PRAGMA busy_timeout=5000;         -- 5s internal retry before SQLITE_BUSY surfaces to the ingest app
PRAGMA cache_size=-64000;         -- ~64MB page cache; negative value = size in KiB
```
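
Some of these settings are per-connection, so the ingest application has to apply them each time it opens the primary. The following is a minimal sketch of doing that from Python and reading the values back; `PRIMARY_PRAGMAS`, `open_primary`, and `read_back` are illustrative names, not part of any library.

```python
import sqlite3

# Values mirror the PRAGMA block above; tune them for your device.
PRIMARY_PRAGMAS = {
    "journal_mode": "WAL",
    "synchronous": "NORMAL",
    "wal_autocheckpoint": 1000,
    "busy_timeout": 5000,
    "cache_size": -64000,
}

def open_primary(path):
    """Open the primary and apply the write-side PRAGMA baseline."""
    conn = sqlite3.connect(path)
    for name, value in PRIMARY_PRAGMAS.items():
        conn.execute(f"PRAGMA {name}={value};")
    return conn

def read_back(conn):
    """Return the values SQLite actually reports for each configured PRAGMA."""
    return {
        name: conn.execute(f"PRAGMA {name};").fetchone()[0]
        for name in PRIMARY_PRAGMAS
    }
```

Reading the values back catches the case where a setting was silently ignored, for example `journal_mode` staying at its previous value on a filesystem that cannot support WAL.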

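Then generate the replica. A raw cp/rsync can capture a mid-transaction page image or drag orphaned -wal/-shm files alongside it; the Online Backup API produces a transactionally consistent copy with no -wal companion. Write the snapshot to a temporary path, strip write permissions so no unprivileged process can mutate it, then rename it into place so a regeneration never overwrites a file that dashboards still have open:

```
import os
import sqlite3

PRIMARY = "/data/primary.db"
REPLICA = "/data/replica.db"

def regenerate_replica(primary: str = PRIMARY, replica: str = REPLICA) -> None:
    tmp = replica + ".tmp"                             # same directory, so the rename stays on one filesystem
    if os.path.exists(tmp):
        os.remove(tmp)                                 # clear a leftover from an interrupted run
    src = sqlite3.connect(primary)                     # writer's DB; backup() only reads from it
    dst = sqlite3.connect(tmp)                         # fresh destination with no -wal companion
    try:
        src.backup(dst)                                # transactionally consistent copy of every page
    finally:
        dst.close()
        src.close()
    os.chmod(tmp, 0o444)                               # r--r--r--: no unprivileged process can rewrite it
    os.replace(tmp, replica)                           # atomic rename: readers see the old or new file, never a partial one

regenerate_replica()
```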

On embedded Linux you can harden this further by placing the replica in a directory that is bind-mounted read-only, so even a root-owned dashboard cannot rewrite it through that mount. A bind mount can also target a single file, but mounting the containing directory additionally stops any process from creating -wal/-shm companions beside the snapshot. Dashboards must then open the replica through the read-only mount point (here /mnt/replica-ro):

```
mount --bind /data/replica-dir /mnt/replica-ro     # expose the snapshot's directory
mount -o remount,ro,bind /mnt/replica-ro           # then drop it to read-only
```

Finally, route every dashboard connection through the immutable snapshot. immutable=1 tells the SQLite VFS the file will not change for the connection’s lifetime, so the pager skips all locking and -shm checks — the reason reads stop contending at all. Pair it with mode=ro and PRAGMA query_only=ON, then read the pragma back to prove the connection cannot write:

```
import sqlite3

# Dashboard backend (Electron main process, Qt worker, or local HTTP handler)
conn = sqlite3.connect(
    "file:/data/replica.db?immutable=1&mode=ro",       # immutable=1: no locking, no -shm mapping
    uri=True,
)
conn.execute("PRAGMA query_only=ON;")                  # reject any write path at the API layer

# Verify the guard actually took: a pool wrapper can silently reset per-connection PRAGMAs
assert conn.execute("PRAGMA query_only;").fetchone()[0] == 1, "query_only not enforced"
```

Verification

Prove the isolation holds before shipping. Four checks, each of which fails loudly if the setup is wrong:

```
import os, sqlite3

# 1. The replica directory must contain NO -wal / -shm companions.
for suffix in ("-wal", "-shm"):
    assert not os.path.exists("/data/replica.db" + suffix), f"orphaned {suffix} beside replica"

# 2. The snapshot must be a self-consistent database.
ro = sqlite3.connect("file:/data/replica.db?immutable=1&mode=ro", uri=True)
assert ro.execute("PRAGMA integrity_check;").fetchone()[0] == "ok"

# 3. A write MUST be rejected, not silently applied.
try:
    ro.execute("CREATE TABLE probe(x);")
    raise SystemExit("FAIL: replica accepted a write")
except sqlite3.OperationalError:
    pass  # expected: attempt to write a readonly database
finally:
    ro.close()  # do not leave a verification handle open across the next regeneration

# 4. The primary's -shm must NOT list any dashboard PID (run under load).
#    Shell: lsof /data/primary.db-shm  ->  only the writer process appears.
```

Under a live ingest burst, dashboard read latency should now stay flat regardless of writer activity, and SQLITE_BUSY should disappear from the UI logs entirely. If it does not, the connection is still opening the primary — grep your logs for the replica path to confirm the routing.

Failure Modes & Gotchas

immutable=1 serves stale data after regeneration. The flag promises SQLite the file will not change for the connection’s lifetime. If you overwrite replica.db in place while a dashboard still holds an immutable connection, that connection mixes pages it has already cached with pages from the new content and can return wrong results or corruption errors. Always regenerate to a temp path on the same filesystem and rename it over the replica: the rename gives the new snapshot a new inode, so open handles keep reading the old, internally consistent file until they are closed. Those handles are stale, so close and reopen dashboard connections after every swap, or version the filename and point new connections at the latest. Never reuse an open immutable handle across a regeneration.

Figure — Why regeneration must swap inodes: overwriting the replica in place corrupts open immutable handles, while writing a temp file and atomically renaming it hands each handle a stable snapshot.
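
One way a dashboard can avoid reusing a handle across a swap is to compare the replica's device and inode numbers before each use and reopen when they change. The class below is a hypothetical sketch of that idea (the name `ReplicaReader` is an assumption, and it assumes a POSIX filesystem where the regenerator replaces the file by rename):

```python
import os
import sqlite3

class ReplicaReader:
    """Hypothetical dashboard-side wrapper that reopens after a replica swap."""

    def __init__(self, path):
        self.path = path
        self.conn = None
        self.ident = None  # (device, inode) of the file the current handle reads

    def connection(self):
        st = os.stat(self.path)
        ident = (st.st_dev, st.st_ino)
        if self.conn is None or ident != self.ident:
            if self.conn is not None:
                self.conn.close()  # never reuse a handle across a regeneration
            self.conn = sqlite3.connect(
                f"file:{self.path}?immutable=1&mode=ro", uri=True
            )
            self.conn.execute("PRAGMA query_only=ON;")
            self.ident = ident
        return self.conn
```

Calling `connection()` before each query costs one `stat` call. Because the old handle still pins the old inode until it is closed, the freshly renamed file cannot be given the same inode number, so the comparison reliably detects the swap.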

Orphaned -wal/-shm files turn the replica read-only-hostile. If a stray -wal lands next to a mode=ro database, SQLite tries to run recovery, cannot get a write lock, and fails the open with SQLITE_READONLY (or SQLITE_CANTOPEN). This is why the backup API — not a file copy — is mandatory, and why the replica must be owned by a non-privileged dashboard user under your filesystem permissions policy so no writer process can drop a journal there. Enforcing the surrounding Security Boundaries & Access Control keeps a compromised dashboard from creating those companions itself.

A connection pool silently drops the URI guards. Many pooling layers rebuild connections from a bare path and re-run their own PRAGMA set, discarding your ?immutable=1&mode=ro parameters and query_only. The pooled connection then opens read-write with normal locking (and, if the pool was configured with the primary’s path, on the primary itself), re-introducing exactly the contention you removed. Assert PRAGMA query_only on checkout (as in the Solution), and configure the pool’s connect string, not just the first hand-made connection. See connection pooling strategies for enforcing per-checkout invariants.
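
A per-checkout check might look like the following sketch. `validate_checkout` is a hypothetical helper, not an API of any pooling library. It relies on two real SQLite pragmas: `PRAGMA query_only` to confirm the write guard, and `PRAGMA database_list` to confirm which file the connection actually opened. It assumes POSIX-style paths.

```python
import os
import sqlite3

def validate_checkout(conn, expected_path):
    """Raise RuntimeError unless conn is a query_only connection to expected_path."""
    if conn.execute("PRAGMA query_only;").fetchone()[0] != 1:
        raise RuntimeError("query_only is not enabled on this connection")
    # database_list rows are (seq, name, file); the first row is schema "main".
    opened = conn.execute("PRAGMA database_list;").fetchone()[2]
    if os.path.realpath(opened) != os.path.realpath(expected_path):
        raise RuntimeError(f"connection opened {opened!r}, not {expected_path!r}")
    return conn
```

Wire it into whatever hook your pool runs when handing out a connection. Note that this does not prove `immutable=1` survived, since the standard `sqlite3` module offers no direct way to read that flag back; the pool's connect string still has to be configured correctly.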

Checkpoint starvation on the primary if a dashboard ever holds a real read. If any consumer still reads the primary (for example a “live” panel that bypassed the replica), a long-running read transaction pins the WAL and blocks wal_checkpoint(TRUNCATE), so the -wal grows until the disk fills. Keep every dashboard on the replica; if a live view is unavoidable, bound its transaction duration and route write-blocking conditions through the fallback routing tiers rather than letting reads accumulate.
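
A pinned WAL can be detected from the writer side, because `PRAGMA wal_checkpoint(TRUNCATE)` returns a row whose first column is 1 when the checkpoint was blocked from completing. The helper below is a small sketch with a hypothetical name (`try_truncate_checkpoint`), meant to run on a connection to the primary:

```python
import sqlite3

def try_truncate_checkpoint(conn):
    """Run wal_checkpoint(TRUNCATE) and report whether it was blocked."""
    # The pragma returns one row: (busy flag, frames in the WAL, frames checkpointed).
    busy, log_frames, checkpointed = conn.execute(
        "PRAGMA wal_checkpoint(TRUNCATE);"
    ).fetchone()
    return {
        "blocked": bool(busy),
        "log_frames": log_frames,
        "checkpointed": checkpointed,
    }
```

A result with `blocked` set on several consecutive attempts suggests some reader is still holding a transaction open on the primary. Keep in mind that the call can wait up to the connection's busy timeout before it reports being blocked.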

For quick reference, the return codes this pattern turns from crashes into handled conditions:

| Code | Trigger | Fallback |
| --- | --- | --- |
| `SQLITE_BUSY` | Dashboard still on the primary during a writer lock | Route to replica; exponential backoff on any residual primary read |
| `SQLITE_READONLY` | Write attempted on the replica, or orphaned `-wal` present | Enforce `query_only`; regenerate via backup API into a clean directory |
| `SQLITE_CANTOPEN` | Replica missing or permissions revoked mid-swap | Watchdog regenerates the snapshot before UI init; serve last-known cache |
| `SQLITE_IOERR` | Flash wear-out / filesystem corruption | Halt replica generation, alert the pipeline, restore from verified backup |
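
The exponential backoff named in the `SQLITE_BUSY` row could be sketched as follows. This is an illustration under assumptions, not a prescribed implementation: `read_with_backoff` and `is_transient` are hypothetical names, and the retry count and delays are placeholders. On Python 3.11+ the exception carries `sqlite_errorcode`; on older versions the sketch falls back to matching the error message.

```python
import sqlite3
import time

def is_transient(exc):
    """True for SQLITE_BUSY (5) / SQLITE_LOCKED (6), including extended codes."""
    code = getattr(exc, "sqlite_errorcode", None)  # available on Python 3.11+
    if code is not None:
        return (code & 0xFF) in (5, 6)             # low byte is the primary code
    return "locked" in str(exc)                    # "database is locked" etc.

def read_with_backoff(run_query, retries=4, base_delay_s=0.01):
    """Call run_query(); on a busy/locked error, retry with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return run_query()
        except sqlite3.OperationalError as exc:
            if not is_transient(exc) or attempt == retries:
                raise                              # permanent error, or out of retries
            time.sleep(base_delay_s * (2 ** attempt))
```

Pass a zero-argument callable such as `lambda: conn.execute(sql).fetchall()`. Errors that are not busy/locked conditions propagate immediately so they are not masked by retries.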

When SQLITE_IOERR or a disk-full condition appears, halt replica generation immediately and never overwrite a corrupted primary from a replica — isolate the node, run a filesystem check, and restore from the last verified backup.

Related Pages

Fallback Routing Strategies — the parent cluster; write-side tiers that pair with this read-side isolation.
File System Permissions & Ownership — locking down the replica so no writer can drop a journal beside it.
Busy Timeout Configuration — sizing the internal retry window on the primary before contention surfaces.
Connection Pooling Strategies — keeping pooled dashboard connections from discarding the read-only guards.
