ERPClaw Runs on SQLite or PostgreSQL. Why SQLite Is the Default
ERPClaw works with SQLite or PostgreSQL. PyPika abstracts the query layer, so the same code runs on either backend. SQLite is the default for self-hosted installs; PostgreSQL is fully supported for enterprise workloads. Here is why.
This post is for the CTO or engineering manager who has shipped enough PostgreSQL-backed enterprise software to know why people pick it, and is now trying to evaluate whether a self-hosted ERP that defaults to SQLite is serious or naive. The short answer: SQLite is the default because it fits the workload of a 40 to 500 employee self-hosted install. PostgreSQL is a first-class option for the workloads that need it. The architecture supports both because the abstraction was built that way from day one.
How database agnosticism works in ERPClaw
Every database query in ERPClaw goes through a thin abstraction built on PyPika, a Python query builder that emits SQL for multiple dialects. PyPika is dialect-aware, so the same Python code that constructs a query against SQLite in a self-hosted install constructs the equivalent query against PostgreSQL in an enterprise deployment. There is no second codebase, no dialect-specific branching scattered through the action handlers, and no rewrite when a customer crosses the threshold where PostgreSQL is the better fit.
Switching engines is a configuration change, not a refactor. The connection layer reads the target backend from environment configuration, hands back a connection object that honors the same interface, and the rest of the application is unaware of which engine is underneath. Schema DDL ports with mechanical translation. The 12-step GL validation, the Decimal-based money handling, the immutable GL semantics, and the constitutional rules engine all run unchanged on either backend.
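As a rough sketch of what that configuration switch can look like. This is illustrative, not ERPClaw's actual code: the environment variable names, the function, and the psycopg2 usage are all assumptions.

```python
import os
import sqlite3


def get_connection(path="data.sqlite"):
    """Return a connection for the configured backend.

    Hypothetical sketch: ERPCLAW_DB_BACKEND and ERPCLAW_DB_DSN are
    invented names, not ERPClaw's real configuration surface.
    """
    backend = os.environ.get("ERPCLAW_DB_BACKEND", "sqlite")
    if backend == "sqlite":
        conn = sqlite3.connect(path)
        # The four production pragmas described later in this post.
        conn.execute("PRAGMA journal_mode = WAL")
        conn.execute("PRAGMA synchronous = NORMAL")
        conn.execute("PRAGMA foreign_keys = ON")
        conn.execute("PRAGMA busy_timeout = 5000")
        return conn
    if backend == "postgresql":
        import psycopg2  # imported only when this backend is selected
        return psycopg2.connect(os.environ["ERPCLAW_DB_DSN"])
    raise ValueError(f"unknown backend: {backend}")
```

Everything above the `get_connection` call site sees one interface; the branch is the entire extent of the dialect awareness in the connection path.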
The rest of this post argues SQLite is the right default. The point of starting here is that you do not have to take that on faith: if the workload outgrows the default, the migration path is mechanical, not architectural.
The reflex to pick PostgreSQL
Every back-end engineer who has built a multi-user system in the last fifteen years has the same default. Pick PostgreSQL. It is the safe answer, the answer that survives an enterprise architecture review, the answer the RFP expects.
That reflex is correct for a wide range of workloads. PostgreSQL is an exceptional database. It scales horizontally with read replicas, it has the best feature set of any open-source RDBMS, and the operational story around it (backups, failover, monitoring) is mature. For a SaaS product with thousands of concurrent writers across hundreds of tenants, PostgreSQL is the right answer, and ERPClaw supports it for exactly those deployments.
The reflex becomes worth questioning when the workload no longer matches the reasoning. A self-hosted ERP for a 40 to 500 employee company has roughly these characteristics: one process writing at a time, bursts of dozens of reads in parallel, total daily write volume well under 100,000 rows, total database size under 10 GB for the first several years, and a single operator who is not a DBA. Most of the assumptions that make PostgreSQL the obvious answer do not apply, which is why SQLite is the default.
Three forces keep PostgreSQL as the assumed default even when SQLite is the better fit:
- Perception of “real” databases. Many developers were taught that SQLite is for mobile apps and CI fixtures, and the result is a generation of engineers who believe a single-file database cannot run production.
- Concurrency myths. The claim that SQLite “cannot handle concurrent writes” is technically true and practically misleading. It serializes writes within a single process. For a workload where writes are already being serialized by the application, this is a feature, not a limitation.
- Vendor RFP requirements. Enterprise procurement forms ask for PostgreSQL or Oracle by name. When the procurement reflex matters, ERPClaw runs on PostgreSQL and the form gets the answer it expects.
The first two are outdated assumptions. The third is a real constraint, and the dual-backend architecture is how ERPClaw addresses it directly.
What SQLite actually does well in 2026
The SQLite documentation has a famous page called Appropriate Uses For SQLite that has not changed much in years, because the answer has not changed much. It is worth re-reading with fresh eyes if your last encounter with SQLite was a Django tutorial.
WAL mode and concurrent reads. SQLite in Write-Ahead Logging mode gives you concurrent readers and a single writer with no reader blocking. A reporting query running against the trial balance does not block a sales invoice submission, and vice versa. This is the configuration ERPClaw runs in production:
```python
conn.execute("PRAGMA journal_mode = WAL")    # concurrent readers alongside one writer
conn.execute("PRAGMA synchronous = NORMAL")  # fsync at WAL checkpoints, not every commit
conn.execute("PRAGMA foreign_keys = ON")     # enforce declared foreign keys
conn.execute("PRAGMA busy_timeout = 5000")   # wait up to 5 s for a lock before SQLITE_BUSY
```
Four pragmas. That is the entire production tuning surface for a database that handles double-entry bookkeeping, inventory, payroll, and 46 modules of business logic.
Foreign key enforcement. SQLite supports declarative foreign keys with cascade rules, and once you set PRAGMA foreign_keys = ON, the enforcement is identical to what you would expect from PostgreSQL. The reason this pragma is off by default is historical compatibility, not a missing feature.
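The enforcement is easy to verify in a few lines. This is a minimal self-contained demonstration with invented table names, not ERPClaw's schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default for historical compatibility
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE invoice ("
    "  id INTEGER PRIMARY KEY,"
    "  customer_id INTEGER NOT NULL REFERENCES customer(id)"
    ")"
)
conn.execute("INSERT INTO customer (id) VALUES (1)")
conn.execute("INSERT INTO invoice (id, customer_id) VALUES (10, 1)")  # valid reference

try:
    # Dangling reference: customer 999 does not exist.
    conn.execute("INSERT INTO invoice (id, customer_id) VALUES (11, 999)")
except sqlite3.IntegrityError as exc:
    print(exc)  # FOREIGN KEY constraint failed
```

Without the pragma, the second insert would silently succeed, which is exactly the historical behavior the default preserves.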
ACID guarantees that match most managed cloud databases. SQLite is fully ACID compliant. Its durability story (when configured correctly) is stronger than several managed cloud databases that quietly default to weaker consistency for performance. A COMMIT in SQLite means the bytes are on disk. A COMMIT in some hosted Postgres-flavored services means the bytes are on the way to disk on a replica that may not have acknowledged yet.
No network round trip. Every query in SQLite is an in-process function call. There is no socket, no protocol parsing, no connection pool tuning, no PgBouncer to deploy. For a workload where the database and the application live on the same machine, this can be a 100x latency improvement on small queries that happen thousands of times per workflow.
Operational simplicity. A SQLite database is a file. Backup is a file copy, taken after a checkpoint or via the online backup API so the WAL sidecar is captured consistently. Restore is the inverse. There is no pg_dump, no wal_archive, no replication slot to manage. For a self-hosted ERP whose operator is the founder of a 50 person company, this matters more than any feature comparison.
The result is a database that, for the workload most ERPClaw installs have, is faster, simpler, more durable, and cheaper to operate than a managed alternative. The reason most engineers do not believe this is that they are reasoning about a different workload, the one PostgreSQL is the right answer for.
Where SQLite falls short (and PostgreSQL takes over)
This is the section the rest of the internet skips, so let me be specific. Each of the following is a real limit of SQLite, and each is the reason the PostgreSQL backend exists in the same codebase.
Write concurrency under contention. SQLite serializes writes. If two processes try to begin a write transaction at the same time, the second one waits up to busy_timeout milliseconds and then fails with SQLITE_BUSY. For a workload where many processes are writing concurrently (a public web app with thousands of users hitting the same tables), this is a real limitation, and the fixes (sharding, queueing, retries) get awkward fast. PostgreSQL is the right answer.
Massive parallel write throughput. A single SQLite process tops out at low thousands of write transactions per second on commodity hardware. PostgreSQL with proper tuning can sustain tens of thousands of writes per second across many connections. For workloads that cross that threshold, switch the backend.
Read replicas. SQLite has no native replication story. You can use Litestream for streaming backups to S3, and projects like LiteFS provide replicated SQLite, but the ergonomics are nowhere near what PostgreSQL gives you for free with streaming replication. If your compliance regime or DR posture requires synchronous standbys, run PostgreSQL.
Network access. SQLite is an embedded library. If you need a database that multiple application servers connect to over the network, SQLite is not the answer without an additional layer. PostgreSQL is built for that shape.
Server-side functions and extensions. PostgreSQL has stored procedures, triggers, JSON operators, full-text search, GIS, and a thousand extensions. SQLite has a useful but smaller set. For analytical workloads or anything that wants the database to do heavy lifting, PostgreSQL wins.
If your workload looks like any of those, configure ERPClaw with the PostgreSQL backend. The next two sections explain why most self-hosted ERPClaw workloads do not look like any of those, which is why SQLite is the default rather than the only option.
How ERPClaw absorbs the SQLite tradeoffs
The default deployment was designed around the SQLite write model, not in spite of it. The architectural choices that make this work are visible in the developers documentation and reproducible in the open-source codebase.
Single-process write serialization. Every ERPClaw write goes through a single Python process talking to a single SQLite file. The AI agent submits an action, the action runs as one transaction, and the next action waits for the first to commit. Submit operations in an ERP are inherently transactional (a sales invoice writes to sales_invoice, gl_entry, stock_ledger_entry, and journal_entry atomically), and serializing them is the correct semantics. The application would have to serialize them anyway to satisfy the 12-step GL validation, so letting SQLite do it for free is a win.
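The transaction shape described above can be sketched in a few lines. The table names come from the post, but the columns are invented for illustration, and storing Decimal money as text is an assumed convention here, not confirmed from the codebase:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_invoice (name TEXT PRIMARY KEY, total TEXT)")
conn.execute("CREATE TABLE gl_entry (account TEXT, debit TEXT, credit TEXT)")
conn.execute("CREATE TABLE stock_ledger_entry (item TEXT, qty INTEGER)")


def submit_invoice(conn, name, total):
    # One transaction: either every table is written or none is.
    with conn:  # BEGIN ... COMMIT, or ROLLBACK if anything raises
        conn.execute("INSERT INTO sales_invoice VALUES (?, ?)", (name, total))
        conn.execute("INSERT INTO gl_entry VALUES ('Debtors', ?, '0')", (total,))
        conn.execute("INSERT INTO gl_entry VALUES ('Sales', '0', ?)", (total,))
        conn.execute("INSERT INTO stock_ledger_entry VALUES ('WIDGET', -1)")


submit_invoice(conn, "SINV-0001", "118.00")
```

Because the submit is one transaction on one connection, SQLite's single-writer model and the application's required semantics coincide.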
5000 ms busy_timeout. The PRAGMA busy_timeout = 5000 setting tells SQLite to wait up to five seconds for a lock before failing. For a workload where the longest write transaction is on the order of tens of milliseconds, that headroom means SQLITE_BUSY is essentially never observed in production. When it does happen, it is a signal of a bug (a transaction held open across an external API call, for example).
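The contention behavior is easy to reproduce: hold a write lock from one connection and attempt a second write with busy_timeout set to zero. This is an illustrative experiment, not ERPClaw code:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "busy.sqlite")
writer = sqlite3.connect(path, isolation_level=None)  # autocommit; explicit txns
writer.execute("PRAGMA journal_mode = WAL")
writer.execute("CREATE TABLE t (x INTEGER)")

other = sqlite3.connect(path, isolation_level=None)
other.execute("PRAGMA busy_timeout = 0")  # fail immediately instead of waiting

writer.execute("BEGIN IMMEDIATE")         # take the write lock and hold it
writer.execute("INSERT INTO t VALUES (1)")

got_busy = False
try:
    other.execute("BEGIN IMMEDIATE")      # second writer: lock unavailable
except sqlite3.OperationalError:          # surfaces as "database is locked"
    got_busy = True
finally:
    writer.execute("COMMIT")              # release the lock
```

With the 5000 ms timeout instead of zero, the second connection would simply wait a few milliseconds and proceed, which is why the error is effectively never seen in practice.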
Concurrent reads under WAL. Reporting queries, dashboard refreshes, and the AI agent’s read-side scans all run concurrently with writes thanks to WAL mode. A trial balance query that takes 200 ms does not block the next sales invoice submission. This is the property that makes the single-writer model acceptable for an interactive application.
The PostgreSQL backend, ready when needed. Because PyPika and the connection layer abstract the dialect, the move to PostgreSQL is a configuration change, not a rewrite. The full migration playbook is on the roadmap as a future engineering post (scaling-erpclaw-from-sqlite-to-postgres), and the features overview documents both stores.
The operational benefits of the SQLite default
I want to give you the operational picture, because this is where the SQLite case stops being theoretical and starts mattering to whoever is going to run the system at 2 a.m. on a Saturday.
| Operation | SQLite (ERPClaw default) | PostgreSQL (typical self-hosted) |
|---|---|---|
| Cold start | < 50 ms | 5 to 30 seconds |
| Backup | file copy or sqlite3 .backup one-liner | pg_dump with role and password setup |
| Restore | cp backup.sqlite data.sqlite | pg_restore with target DB created first |
| Disk footprint (empty) | 1 MB | 40 to 60 MB |
| RAM at idle | < 10 MB | 100 to 300 MB depending on config |
| Network exposure | none (file on disk) | port 5432, needs firewall and auth |
| Connection pooling | not needed | usually requires PgBouncer |
| Schema migration tooling | direct DDL | Alembic or similar |
| Operator skill required | basic Unix | DBA familiarity helpful |
Some of these are unfair to PostgreSQL on a properly managed cluster, and some are unfair to SQLite (the lack of network exposure is a feature, not a deficiency). The point is the cumulative shape: the SQLite default eliminates entire categories of operational concern. There is no DBA. There is no replication lag. There is no pg_hba.conf to tune. Dev/prod parity is total, because development and production use the same single-file database and the same four pragmas. When you switch to the PostgreSQL backend, you take on the operational surface in exchange for the capabilities that come with it.
Self-hosted ERP runs on a laptop
An ERPClaw instance with a year of transactions, twenty users, three integrations, and a full GL fits in a few hundred megabytes on a developer’s laptop. You can copy it to a USB stick. You can run it offline. You can run a full integration test against a fresh fixture in under thirty seconds.
That is not because ERPClaw is small. The schema is 789 tables across 46 modules; the action surface is 3,126 actions; the test suite has more than 7,000 passing assertions. It is because SQLite makes the runtime overhead of a database engine almost zero, and because a single file is a fundamentally different operational primitive than a database server. SQLite is what makes clawhub install erpclaw work as a one-line setup. PostgreSQL is the right backend for a public SaaS with 50,000 tenants, which is why it lives behind a backend flag for the deployments that want it.
When PostgreSQL is the right backend
The honest version of any architecture post needs a section that says “default to the other option, in these cases.” For ERPClaw, the cases are clear, and PostgreSQL is a first-class choice for each of them.
Enterprise multi-tenant deployments with shared database. If you are running ERPClaw as a hosted product where many tenants share one database and you need row-level security, sophisticated indexing across tenant boundaries, and online schema changes that do not block writers for long stretches, PostgreSQL is the right backend. SQLite per-tenant works for some shapes (one file per tenant) but the ergonomics get awkward past a few hundred tenants.
More than 1 million write events per day on a single instance. SQLite can absorb a lot more than people assume, but if your sustained write rate puts you in the millions per day on a single database, PostgreSQL with proper tuning, connection pooling, and partitioning is the right answer. Configure the backend at install time and the rest of the application does not change.
Hard read-replica requirements. If your workload requires geographically distributed read replicas with bounded lag, or if your compliance regime requires synchronous replication to a standby, PostgreSQL has spent twenty years getting this right. ERPClaw rides on top of that maturity through the same abstraction layer.
Regulatory backup and replication policies. If your compliance posture mandates point-in-time recovery with a specific RPO, WAL archiving to a managed store, or audited replication to a regulated jurisdiction, PostgreSQL’s tooling is the path of least friction. Run the PostgreSQL backend and inherit the ecosystem.
Heavy analytical queries against the OLTP store. SQLite can do joins and aggregations at respectable speed, but it does not have a query planner that handles 200-line analytical queries the way PostgreSQL does. If your reports are doing serious analytics over millions of rows, run PostgreSQL or, better, replicate to a separate analytical store like DuckDB or ClickHouse.
Many writer processes. If your application architecture has multiple processes writing to the same database concurrently and you cannot serialize them through a single front-end, PostgreSQL is the right backend. The contention model is what you want.
ERPClaw’s roadmap includes a managed cloud edition where the workload genuinely will cross some of these thresholds. That edition runs on PostgreSQL via the abstraction layer described earlier. The single-tenant self-hosted edition stays on SQLite as the default because that is what the workload calls for. One application, two stores, no compromise.
FAQ
Is SQLite really safe for production financial data?
Yes, when configured correctly. SQLite is one of the most thoroughly tested pieces of software on Earth. The test suite runs to over 100x the size of the source code. With WAL mode, synchronous = NORMAL, and foreign keys enabled, the durability guarantees match what you would expect from any serious RDBMS. The constitutional rules engine in ERPClaw adds another layer: every GL posting is verified against twelve invariants before commit, and the trial balance is checked across the entire database after every test run. The same invariants run against the PostgreSQL backend, because they sit above the abstraction layer.
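The debits-equal-credits invariant can be sketched as a single scan. The schema here is a hypothetical minimum, and this one check stands in for ERPClaw's full twelve-step pipeline:

```python
import sqlite3
from decimal import Decimal

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gl_entry (account TEXT, debit TEXT, credit TEXT)")
conn.executemany(
    "INSERT INTO gl_entry VALUES (?, ?, ?)",
    [
        ("Debtors", "118.00", "0"),
        ("Sales", "0", "100.00"),
        ("VAT Output", "0", "18.00"),
    ],
)


def trial_balance_is_zero(conn):
    # Sum in Python with Decimal so money never touches binary floats.
    total = Decimal("0")
    for debit, credit in conn.execute("SELECT debit, credit FROM gl_entry"):
        total += Decimal(debit) - Decimal(credit)
    return total == 0


assert trial_balance_is_zero(conn)
```

Because the check runs above the abstraction layer, the same function works unchanged against either backend's cursor.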
How does ERPClaw stay portable between SQLite and PostgreSQL?
PyPika constructs queries in a dialect-aware way, and the connection layer hands the application a connection object that honors the same interface regardless of backend. Action handlers do not know which engine is underneath. Schema DDL is the only place where the two backends diverge meaningfully, and that divergence is mechanical translation, not redesign.
What happens when I outgrow SQLite?
You switch the backend to PostgreSQL. The query layer is dialect-aware, the schema migrates with mechanical translation, and the application code is unchanged. The trigger condition is sustained write contention or a hard requirement that SQLite cannot satisfy (replication, multi-writer, regulatory tooling), not raw size. SQLite databases of tens of gigabytes are common in production.
How do you back up a SQLite database that is being written to?
Use the SQLite backup API, or rely on WAL mode and copy the main file plus the -wal and -shm sidecars. ERPClaw’s recommended backup script does both: a hot snapshot on a schedule, plus an off-host copy via Litestream for disaster recovery. For PostgreSQL deployments, the standard pg_basebackup plus WAL archiving stack applies.
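Python's standard library exposes the online backup API directly as sqlite3.Connection.backup. A minimal hot-snapshot sketch, with paths and schema invented for illustration:

```python
import os
import sqlite3
import tempfile

src_path = os.path.join(tempfile.mkdtemp(), "data.sqlite")
src = sqlite3.connect(src_path)
src.execute("PRAGMA journal_mode = WAL")
src.execute("CREATE TABLE gl_entry (account TEXT, debit TEXT, credit TEXT)")
src.execute("INSERT INTO gl_entry VALUES ('Sales', '0', '100.00')")
src.commit()

# Online backup: safe to run while other connections read and write, and it
# produces one consistent file with no -wal sidecar to chase.
dest = sqlite3.connect(src_path + ".bak")
with dest:
    src.backup(dest)
```

The resulting `.bak` file is a complete standalone database, which is what makes restore a plain file copy.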
Why not DuckDB?
DuckDB is excellent for analytical workloads, and ERPClaw will likely use it for reporting against historical data. For OLTP (the transactional submit path that posts to the GL), SQLite is the default and PostgreSQL is the alternative. DuckDB is complementary to both, not a replacement.
Does SQLite support JSON columns?
Yes. The JSON functions have been built in by default since 3.38, and 3.45 added jsonb, a binary storage format for JSON. ERPClaw uses JSON columns for flexible audit metadata; the rest of the schema is normalized. PostgreSQL’s JSONB is more capable, and ERPClaw uses the richer feature set when the backend is PostgreSQL.
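A quick illustration of the built-in JSON functions (this assumes a SQLite build of 3.38 or later, which current Python releases bundle; the metadata shape is invented, not ERPClaw's):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# json_extract pulls a value out of a JSON document by path.
val = conn.execute(
    "SELECT json_extract(?, '$.approved_by')",
    ('{"approved_by": "jane", "step": 7}',),
).fetchone()[0]
print(val)  # jane
```

The same query text works against PostgreSQL with its own JSON operators swapped in at the abstraction layer, which is one of the few places the dialects meaningfully diverge.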
How do you handle schema migrations without ALTER TABLE limitations?
SQLite’s ALTER TABLE is more limited than PostgreSQL’s, but the workaround (create the new table, copy the data, drop the old table, rename) is well understood and easy to script. Each module’s init_db.py owns the schema for that module’s tables, and migrations are versioned alongside the module. On the PostgreSQL backend, migrations use direct ALTER TABLE where possible.
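The rebuild pattern looks like this in practice (hypothetical table, not ERPClaw's schema; the whole rebuild runs inside one explicit transaction so a failure leaves the old table intact):

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit; txns explicit
conn.execute("PRAGMA foreign_keys = OFF")  # off while the table is rebuilt
conn.execute("CREATE TABLE item (name TEXT PRIMARY KEY, rate TEXT)")
conn.execute("INSERT INTO item VALUES ('WIDGET', '9.99')")

# SQLite cannot add a column constraint in place, so: create the new shape,
# copy the rows, drop the old table, rename. DDL is transactional in SQLite,
# so the whole rebuild commits or rolls back as a unit.
conn.execute("BEGIN")
conn.execute(
    "CREATE TABLE item_new ("
    "  name TEXT PRIMARY KEY,"
    "  rate TEXT NOT NULL DEFAULT '0'"  # constraint the old table lacked
    ")"
)
conn.execute("INSERT INTO item_new (name, rate) SELECT name, rate FROM item")
conn.execute("DROP TABLE item")
conn.execute("ALTER TABLE item_new RENAME TO item")
conn.execute("COMMIT")
conn.execute("PRAGMA foreign_keys = ON")
```

On the PostgreSQL backend the same migration is usually a single ALTER TABLE, which is exactly the divergence the versioned per-module migrations absorb.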
Closing
Picking SQLite as the default is not a contrarian flex, and it is not an exclusion of PostgreSQL. It is what happens when you let the workload pick the database instead of the other way around, and then build the abstraction so the workload can change its mind. For a self-hosted, AI-native ERP that targets the 40 to 500 employee mid-market, SQLite is the right default for the same reason a single binary is the right deployment target: simplicity compounds when the operator is not a specialist. For the deployments that need PostgreSQL, the backend is there, fully supported, and a config change away.
If you want to see the architecture in practice, the ERPClaw codebase is open source and the entire database layer (PyPika abstraction, pragmas, transaction boundaries, GL validation pipeline) is in the open. The pricing page is short for the self-hosted edition, because the answer is zero. If you have a workload you think breaks the SQLite default, run the PostgreSQL backend; that is what it is for.
Related posts
Agency Accounting, Project P&L, and Time Billing: A Real Guide
Why your friendly retainer client is secretly losing you $40 an hour, the four metrics every agency owner should know, and how to fix project P&L.
A2X Alternative: The Free Open-Source Tool Most Shopify Stores Don't Know About
Looking for an A2X alternative? ERPClaw uses the same clearing account method, books every order separately, includes per-warehouse stock costs, and costs $0. A founder's honest comparison.
AI Decorated vs AI Native Software: Why Most AI Features Will Lose
AI-decorated tools bolt a chatbot onto 2015 software and charge a new fee. AI-native software rebuilds the architecture. One of these wins. Here is why.