Storage modes
Audience: developers choosing how ModelVault persists data.
ModelVault is an embedded database: you open it inside your process as either a durable file or an in-memory image. There is no separate server to configure. This page covers on-disk deployment (the default), in-memory testing, and the planned hybrid/streaming mode for very large queries.
Background: Why ModelVault · vs JSON files
Current status (1.0)
| Mode | Status |
|---|---|
| On-disk | ✅ Open/create, superblocks, checksummed segments, schema catalog, indexes, typed queries, transactions |
| In-memory | ✅ Same APIs (open_in_memory / Database::open_in_memory; await AsyncDatabase.open_in_memory() for asyncio) + snapshot export/import |
| Hybrid / streaming | Roadmap: buffer pool + bounded-memory query operators |
Engine layout: Rust crate layout · Plans: ROADMAP
On-disk (default)
One .modelvault file: durable embedded persistence, "ship a file with your app."
```python
import modelvault

db = modelvault.Database.open("app.modelvault")
# asyncio / FastAPI:
# db = await modelvault.AsyncDatabase.open("app.modelvault")
```

```rust
let db = Database::open("app.modelvault")?;
```
Format details: On-disk file format.
When to choose this
Any production app that needs durability without a separate database server.
In-memory (fast, explicit snapshot)
Same logical API; state lives in RAM.
| Property | Behavior |
|---|---|
| Durability | None implicit; snapshot when you choose |
| Speed | No steady-state I/O latency |
| Use cases | Tests, prototypes, ephemeral UI state |
```python
import modelvault

db = modelvault.Database.open_in_memory()
data = db.snapshot_bytes()  # save
db2 = modelvault.Database.open_snapshot_bytes(data)  # restore
```

```rust
let db = Database::open_in_memory()?;
```
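Because in-memory durability is explicit, a save point is only as safe as the way the snapshot bytes reach disk. The following is a minimal sketch, assuming `data` is whatever `snapshot_bytes()` returned; the helpers `save_snapshot` and `load_snapshot` are hypothetical names for illustration and are not part of ModelVault.

```python
import os
import tempfile
from pathlib import Path


def save_snapshot(data: bytes, path: str) -> None:
    """Write snapshot bytes to `path` so readers never observe a partial file."""
    target = Path(path)
    # Create the temp file next to the target so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(target.parent), prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to stable storage before renaming
        os.replace(tmp_name, target)  # replaces any previous snapshot (atomic on POSIX)
    except BaseException:
        # Remove the temp file if anything failed before the rename completed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_snapshot(path: str) -> bytes:
    """Read the bytes back, ready to pass to Database.open_snapshot_bytes(...)."""
    return Path(path).read_bytes()
```

With the calls shown above, usage would look like `save_snapshot(db.snapshot_bytes(), "state.snapshot")` and later `modelvault.Database.open_snapshot_bytes(load_snapshot("state.snapshot"))`; the file name is only an example.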
Hybrid & streaming (planned)
For datasets larger than RAM while staying embedded.
Hybrid buffered (pager / buffer pool)
- Database remains a normal on-disk .modelvault file
- Internal buffer pool loads pages/segments on demand
- Dirty data is written back to the same file
SQLite-style: hot working set in RAM, durable backing storage.
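Since hybrid mode is still planned, ModelVault's actual pager design is not documented here. As a generic illustration of the idea (hot pages cached in RAM, dirty pages written back to the same file), here is a small LRU buffer pool sketch. `BufferPool`, `PAGE_SIZE`, and the method names are all assumptions made for this example.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # illustrative page size, not a ModelVault constant


class BufferPool:
    """Tiny LRU page cache over one file, with write-back of dirty pages."""

    def __init__(self, path, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._file = open(path, "r+b")  # the backing file must already exist
        self._capacity = capacity
        self._pages = OrderedDict()  # page number -> bytearray, oldest first
        self._dirty = set()          # page numbers modified since last write-back

    def _load(self, page_no):
        if page_no in self._pages:
            self._pages.move_to_end(page_no)  # mark as most recently used
            return self._pages[page_no]
        if len(self._pages) >= self._capacity:
            self._evict()
        self._file.seek(page_no * PAGE_SIZE)
        raw = self._file.read(PAGE_SIZE)
        page = bytearray(raw.ljust(PAGE_SIZE, b"\x00"))  # zero-fill short reads
        self._pages[page_no] = page
        return page

    def _evict(self):
        page_no, page = self._pages.popitem(last=False)  # least recently used
        if page_no in self._dirty:
            self._write_back(page_no, page)

    def _write_back(self, page_no, page):
        self._file.seek(page_no * PAGE_SIZE)
        self._file.write(page)
        self._dirty.discard(page_no)

    def read(self, page_no):
        return bytes(self._load(page_no))

    def write(self, page_no, offset, data):
        if offset < 0 or offset + len(data) > PAGE_SIZE:
            raise ValueError("write must fit inside one page")
        page = self._load(page_no)
        page[offset:offset + len(data)] = data
        self._dirty.add(page_no)

    def flush(self):
        for page_no in sorted(self._dirty):
            self._write_back(page_no, self._pages[page_no])
        self._file.flush()

    def close(self):
        self.flush()
        self._file.close()
```

The pool never holds more than `capacity` pages, so memory use is fixed regardless of file size. A real pager would also need pinning, crash-safe write ordering, and checksums, none of which this sketch attempts.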
Streaming query execution
Pull-based operators for scans/filters/limits; bounded-memory aggregations and joins (spillable strategies).
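To make "pull-based" concrete, the sketch below models scan, filter, and limit as Python generators, plus an aggregation that keeps only a count and a sum. It is a conceptual illustration, not ModelVault's planned operator interface; the names `scan`, `where`, `limit`, and `running_average` are assumptions chosen for this example.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator

Row = dict


def scan(source: Iterable[Row]) -> Iterator[Row]:
    """Leaf operator: yields rows one at a time from any iterable source."""
    for row in source:
        yield row


def where(rows: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Filter operator: pulls from its child only when asked for the next row."""
    return (row for row in rows if predicate(row))


def limit(rows: Iterator[Row], n: int) -> Iterator[Row]:
    """Limit operator: stops pulling after n rows, so upstream work stops too."""
    return islice(rows, n)


def running_average(rows: Iterator[Row], field: str) -> float:
    """Aggregation in constant memory: only a count and a running sum are kept."""
    count, total = 0, 0.0
    for row in rows:
        count += 1
        total += row[field]
    return total / count if count else 0.0
```

Composing `limit(where(scan(source), predicate), 2)` builds the plan without reading anything; rows are pulled only when the result is iterated, and the scan stops as soon as the limit is satisfied. No operator materializes its input, which is what keeps memory bounded.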
Spill location (planned default)
Internal temporary segments inside the same .modelvault file, which preserves the single-file mental model.
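For readers unfamiliar with spilling, the sketch below shows the general technique with an external merge sort: sorted runs are written out when the in-memory budget is reached, then merged lazily. Note the deliberate difference from the plan above: this example spills to separate temporary files for simplicity, whereas ModelVault intends to spill into segments inside the database file. `external_sort` and `_spill_run` are hypothetical names.

```python
import heapq
import tempfile
from typing import Iterable, Iterator, List


def _spill_run(sorted_chunk: List[int]):
    """Write one sorted run to a temporary file, one integer per line."""
    run = tempfile.TemporaryFile(mode="w+")
    run.writelines(f"{value}\n" for value in sorted_chunk)
    run.seek(0)  # rewind so the merge phase can read from the start
    return run


def external_sort(values: Iterable[int], max_in_memory: int) -> Iterator[int]:
    """Sort a stream while buffering at most max_in_memory values before spilling."""
    if max_in_memory < 1:
        raise ValueError("max_in_memory must be at least 1")
    runs = []
    chunk: List[int] = []
    for value in values:
        chunk.append(value)
        if len(chunk) >= max_in_memory:
            runs.append(_spill_run(sorted(chunk)))  # budget reached: spill a sorted run
            chunk = []
    if chunk:
        runs.append(_spill_run(sorted(chunk)))
    try:
        # heapq.merge consumes its inputs lazily, one parsed value per run at a time
        # (plus whatever the file objects keep in their read buffers).
        streams = [(int(line) for line in run) for run in runs]
        yield from heapq.merge(*streams)
    finally:
        for run in runs:
            run.close()  # temporary files are deleted when closed
```

The same shape (fill a bounded buffer, spill, merge) underlies spillable joins and aggregations; only the spill destination and the merge logic change.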
Decision guide
```mermaid
flowchart TD
    A[Need durability?] -->|Yes| B[On-disk]
    A -->|No| C[In-memory + snapshots]
    D[Dataset >> RAM?] --> E[Hybrid/streaming when available]
    B --> F[Production apps]
    C --> G[Tests & prototypes]
```
| Need | Mode |
|---|---|
| Durability | On-disk |
| Speed + explicit save points | In-memory + snapshots |
| Beyond-RAM workloads | Hybrid (when shipped) |